Traces on the Archive — a Hybrid Lecture Player


Player: https://mcluhan.consortium.io/
Software: https://github.com/consortium/hybrid-lecture-player

Introduction

This research case study focuses on traces on the archive: revealing the hidden journey of a user through an archive, based on the Marshall McLuhan collection at the McLuhan Salon, Canadian Embassy, Berlin. The study was conducted by the Hybrid Publishing Consortium (HPC), an interdisciplinary research group that investigates the future of publishing and user engagement with museums, archives and libraries. HPC is dedicated to Open Source software development and to enabling crossmedia interoperability.

In the following case study we examine the lengthy video documentation of an insightful lecture by the historian and curator Graham Larkin, held at the Canadian Embassy in Berlin in 2011, and unpack its sections and layers to transform them into a hybrid lecture that opens up new, revealing ways of accessing the original talk. The lecture is a perfect fit for our case study, as it uniquely covers Marshall McLuhan's extensive, experimental and avant-garde media practice, which usually goes unnoticed next to McLuhan's theoretical oeuvre. It is also an exemplary case for tracing a scholar's personal journey through an archive - here, Graham Larkin's journey through the Marshall McLuhan fonds at Library and Archives Canada, Ottawa.

The Dewline Newsletter II/3
Figure: The Dewline Newsletter II/3, "The End of Steel and/or Steal: Corporate Criminality Vs. Collective Responsibility" (Nov-Dec 1969), which included the famous card deck

This case study is part of a broader investigation of the HPC concerned with tracing the use of an archive and publishing from archives. In addition to making multimedia content practically available through digital technologies - such as the Hybrid Lecture Player - and across platforms, the study also develops concepts that provide a variety of methods and examples of how one can begin accessing an archive.

By "traces on the archive" we refer to the traces of user activity in the archive which track the search and use of artifacts, together with the trains of thought of a user, and which are recorded by means of more elaborate annotation. These traces highlight the hidden parts of an archive, as well as the visitor’s journey through a repository. The underlying aim is to create a way of allowing experts, groups and novices to show their pathways through the archive, in addition to the work carried out by the formal or official archivist or collection team (Archive Manifesto, Hui 2013). As a result of this innovation a user can author a pathway through the collection, and we regard these pathways as a new type of publication - as an extra layer to the archive, in the same kind of way that users can create layers on OpenStreetMap’s base city maps.

an additional layer on the archive
Figure: Example of an added layer on a default map. Our additions to the archive can be understood in the same way, as an additional layer on the archive. Taken from: https://visualisingadvocacy.org/sites/drawingbynumbers.ttc.io/files/35_0.jpg. Tactical Tech Collective https://www.tacticaltech.org/

While we look at a variety of ways to make these use-pathways visible and manifest as publishing resources we have two guiding questions:

  1. How can a multimedia repository become meaningful to new audiences? How can these traces serve as maps for future users and as educational tools for the public programming of an archive? Our point of departure is the conviction that information has to circulate in order to remain within contemporary culture and discourse (archive2020, Dekker 2009). In order to remain in circulation, content must be digitally permeable and malleable, which requires involving metadata and collection management tools (the record keeping, indexing and organisation of an archive).
  2. What needs to happen in the translation process? This includes the translation of event formats, online migration, or translation from one medium to another (video documentation to digital transcription, for example), in order to see how these can be re-utilized in a digital context. What kinds of digital publishing strategies and user engagement concepts can be developed to make a visit to an archive worthwhile and sustained? Here we look particularly at the archive's own existing strategies, such as, in the case of the McLuhan Salon, the Centennial (2011) event organized by Stephen Kovats and the McLuminations (2011-ongoing) event series by Baruch Gottlieb and Steffi Winkler (McLuminations, Winkler, Gottlieb 2014). As part of this we examine the role of the archive user as a newly formed protagonist: a 'user-archivist'.

So far we have developed a number of methods and approaches for working with the archive. One is a phased approach to user engagement, moving from social-media bookmarking, to creating playlists, up to authoring new content. Another is to modularise expert lectures on an archive and make them accessible to the public via the Open Web and open standards. These methods and approaches are outlined in the following detailed section on the case study.

Hybrid Lecture Player interface
Figure: Hybrid Lecture Player interface showing a clip of McLuhan in his office (left), the Graham Larkin presentation video (right), the full presentation transcript (bottom left) and a contextual link for 'Lenny Bruce' (bottom right)

Case Study: Hybrid Lecture Player: Finally Getting the Message

In order to provide a meeting point for researchers, scholars and software developers alike, we hosted an initial workshop at the McLuhan Salon in the Canadian Embassy, Berlin, on November 26, 2014 (Traces on McLuhan: http://www.consortium.io/traces-mcluhan). The workshop was set up as a rapid prototyping session to establish how to move forward with the concept of an Archive Sprint and publishing from the archive (Archive Architectures, Worthington/Zehle/Cornwell/van Mourik Broekman 2015). Over a full day we collected input and facilitated a discussion between archiving and publishing technologists, as well as McLuhan scholars, while also sharing our experiences and ambitions for activating archives.

Archive Sprint workshop, McLuhan Salon Canadian Embassy, Berlin
Figure: Group from the November 2014 Archive Sprint workshop, McLuhan Salon Canadian Embassy, Berlin. http://www.consortium.io/mcluhan-media-sprint-recap-outlook

During the workshop it was proposed that we work with Graham Larkin's presentation, a one-hour-forty-two-minute lecture held in 2011 as part of the McLuhan Centennial event organized by Stephen Kovats and hosted at the McLuhan Salon in Berlin. Graham's lecture drew attention to McLuhan's extensive experimental media and publishing practice, which is largely overlooked. He accompanied his talk with a set of three hundred and sixty-six slides and hand-picked audio and video footage. The lecture was video recorded, but the recording cannot convey the richness of the presentation; it misses detail from the slides, for example. In short, the documentation has become an artefact, but as an educational tool for future scholars it is of limited use.

The question for us was therefore: if a lecture video contains valuable insights into an archive, how can we reveal the knowledge held in the video and encourage users to watch it and engage further with that archive?

In our initial assessment we unpacked the lecture, highlighting its valuable elements. We transcribed the lecture and added subtitles to the video using the Open Source platform Amara (http://amara.org, a.k.a. Universal Subtitles). We then transformed the transcription into a prose text, following the Presidential Oral History style guide (http://millercenter.org/oralhistory/styleguide).
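Amara can export the finished captions in standard subtitle formats; in WebVTT, for example, each cue pairs a time range with its caption text, which is what later allows captions to be synced against the video's playback clock. The cue below is invented for illustration and is not a line from the actual transcript:

```
WEBVTT

1
00:01:12.000 --> 00:01:16.500
So today I want to talk about McLuhan's
experimental publishing practice.

2
00:01:16.500 --> 00:01:21.000
This is the side of his work that usually
goes unnoticed.
```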

We re-introduced the topic sections that Graham had used in his talk, and Graham provided us with the original slides. The slides were time coded so that they could be synced with the video and the text. This is a time-consuming process requiring a number of contributors, but it is easy to learn and execute (transcribe the video captions, edit the transcript into prose, time code the slides, and time code the external media/links). Once the time coding was done, the Hybrid Lecture Player software synced the separate parts (video, captions, transcript as prose text, slides, and external media/links) using a GitHub repository. Identifying a reusable workflow from this process, and documenting and designing it to be as simple as possible, is very important in order to turn it into a repeatable method of enquiry.
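The syncing step can be sketched as a simple lookup: given the video's current playback time and a list of time-coded entries (slides, transcript passages or external links), find the entry that should currently be shown. The data shape below is an assumption for illustration, not the actual schema of the player's data files:

```javascript
// Time-coded entries, sorted by start time in seconds.
// Field names here are illustrative, not the player's real schema.
const slides = [
  { start: 0,  src: "slide-001.png" },
  { start: 42, src: "slide-002.png" },
  { start: 95, src: "slide-003.png" },
];

// Return the entry active at time t: the last entry whose start
// time is <= t, or null if t is before the first entry.
function activeEntry(entries, t) {
  let current = null;
  for (const entry of entries) {
    if (entry.start <= t) current = entry;
    else break; // entries are sorted, so we can stop early
  }
  return current;
}
```

In the browser, a lookup like this would be driven by the video element's `timeupdate` event, with the slide pane, transcript highlight and external-media pane each updated from their own time-code list.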

Time code functionality of the Hybrid Lecture Player
Figure: Time code functionality of the Hybrid Lecture Player, showing how it combines video, subtitle captions, text, external media and links into a synchronised collection using time codes, all in an easy-to-author way. See the data files here: https://github.com/consortium/traces_on_mcluhan/tree/gh-pages/hybridvideo/data

We developed a concept of how the lecture media could be formatted and displayed to become a ‘reading environment’ that could be of use to anyone interested in McLuhan’s media practice who was not at the lecture.

We developed the Hybrid Lecture Player to combine these components on one browser page. The page has a header with the hybrid lecture title; below it, two menus offer access via sections or via publications. If you do not want to view the lecture from beginning to end, you can select a section and jump straight to it; each section is briefly described in a window next to the menu. The video is displayed alongside the deck of slides in a split screen, so that while the video plays on the right you can watch the slides change in sync on the left. The video itself contains subtitles that can be edited; currently they are in English only, but they will be available in French and German in the near future. Below the slides and the video you will find the transcription window. An additional screen, bottom right, is reserved for external footage such as video clips and for supporting background information or links.

The transcription, previously transformed/edited into prose text, is synced with the video and progresses along with it; the passage currently referred to in the video is highlighted by underlining. The viewer has the option to download this text (alongside a selection of slides) as a multi-format publication. The conversion employed uses A-machine (a modular publishing software ecosystem from the HPC) and Transpect (a multi-format publishing transformation software from the company le-tex).

The footer of the page contains an 'About' section outlining the case study, further research links related to it, and the full credits.

The lecture player is developed with Famo.us, an Open Source JavaScript rendering engine that provides smooth playback of video and browser graphics, ideal for desktop and mobile HTML5 applications.

In this process we have identified a collection management tool set to which we plan to connect our prototype in the near future. This includes freizo from the Data Futures Lab and Tamboti from the Heidelberg Research Architecture, as well as video archive management from Pandora. All of these collection management tools use established metadata frameworks and controlled vocabularies, e.g. those of the Visual Resources Association (VRA) and the Metadata Object Description Schema (MODS) from the US Library of Congress. In addition to collection management we also use a publishing model of the Academic Book of the Future (Archive Architectures, Worthington/Zehle/Cornwell/van Mourik Broekman 2015) and Alan Kay's Dynabook model (A Personal Computer for Children of All Ages, Kay 1972). The latter two are brought together by our multi-format publishing framework A-machine and the le-tex software (please see the detailed description in the Windows section).

Conclusion & Next Steps

Conclusion

As researchers and publishing developers we seek to conceptualize strategies that make publishing from archive repositories future-savvy. This involves paying close attention both to existing and historical forms of user engagement with archival material, and to how information can be made accessible and relevant to diverse audiences. One of our goals is to facilitate and support the transforming role of the user, from a restricted consumer to an agent or chief actor, which implies a new understanding of what using an archive means, as well as a new literacy around how to access and work with archival material (Free, Rogoff 2010). Consequently, this means not necessarily serving an institutional model but defining methodologies of one's own (Archive Manifesto, Hui 2013) and publishing in new ways (Traces on the Archive - Dossier, Kral, Worthington, 2015).

So to return to our guiding questions: 1) How can a multimedia repository become meaningful to new audiences?, and 2) What needs to happen in the translation process? We translated the original video documentation of Larkin’s lecture into a multimedia player. This player breaks up the content into smaller parts and adds a signage system to allow the user to enter the documentation at various points. The video was enriched with supporting/additional elements such as original slides, transcription, subtitles and translations to grant greater comprehension of and insight into the original lecture.

We took the opportunity to work with this lecture as an example of an existing ‘publication’ format of the McLuhan Salon at the Canadian Embassy, Berlin — this proved a good way to analyse what has already been done on the part of the institution to engage users with the archive. And by tracing a user’s journey through the archive (in this case Larkin’s) we practically investigated how this media package could be transformed into a useful educational resource.

This approach is representative of our understanding of research as a practical form of inquiry. The process usually involves a diverse group of contributors, in order to combine imagination and appropriation with straightforward software development (Sommerville 2011) and implementation in real situations/case studies. Such a focus places something concrete, but not finished, into the ongoing discourse on the 'future of publishing' and allows ideas to permeate, feed back, transform, adapt, integrate and develop further. In fact, every finalised case study yields further steps to be taken in our research.

Next Steps

In this case, the next steps will be to connect the demo of the Hybrid Lecture Player to collection management and digital publishing. The current demo has been built as a stand-alone web presentation, but our ambition is to connect it to archive management software to allow for easier authoring by users, further enabling the user to move from being a consumer to becoming an author of a hybrid lecture him- or herself. Below is our roadmap for the project.

Traces on the Archive roadmap
Figure: Traces on the Archive roadmap, moving from a stand-alone web presentation publishing demo to a connected Open Source archive publishing framework.
  1. Hybrid Lecture Player Demo Release - the initial demo is to act as a testing prototype for reflection and further design, covering: interfaces, how to augment media, and technical implementations for mobile, web, tablet and print publishing.
  2. Hybrid Lecture Player - Open Source Release - the Open Source release allows testing of two parts of the framework: a) How to design the framework as a minimal setup for someone to add shorter video work, and b) To gauge the feasibility for an editor to create their own publication, which involves editing the time code files to sync text, video and external links.
  3. Multi-format Publishing - there are two main goals: to release the Hybrid Lecture Player as a software framework and as a module of the larger A-machine publishing ecosystem, and to produce a prototype research publication for the development of processes, methods and case studies. There will be an emphasis on the Hybrid Publishing Consortium's model of interoperability, standards, design methods and working with existing Open Source expert projects.
  4. McLuminations 2011 - an accompanying publication by Steffi Winkler and Baruch Gottlieb covering the 2011 McLuminations series. This demo publication would focus on the eBook format and the Transpect multi-format publishing software workflow.
  5. Archive Digitization - full archive collection management and compatibility with the A-machine structured data multi-format publishing framework. This includes examples such as the Heidelberg Research Architecture tools (including Tamboti), Pandora, scanning, and Data Futures' freizo software for asset management. All of the above in accordance with the VRA/MODS US Library of Congress metadata standards and using controlled vocabularies.
  6. Collaborative Publishing Layers - the Traces on the Archive publishing tool kit combines two closely related models of publishing: Archive Architectures and the Dynabook. Some practical examples of publishing functionality are: A. social media bookmarking, B. playlists, C. authoring new content, D. the scholar's notebook and archive tracer. This also envisages case studies of 'publishing from the archive' in the Open Education Resources (OER) sector.

Credits

This case study is a project by the Hybrid Publishing Consortium (HPC), a research project with a mission to support the development of Open Source software for public infrastructures in publishing. HPC examines publishing workflows, builds software components that enable multi-format conversion (e.g. eBooks, print-on-demand, online learning systems), and develops concepts for new forms of access to information. Within its research HPC specialises in Museums, Libraries and Archives (MLA) publishing. HPC is part of the Hybrid Publishing Lab at the Centre for Digital Cultures, Leuphana University, Lüneburg, Germany, and is funded by the European Union and the German Federal State of Lower Saxony. Thanks also to our collaboration partner on the project, the Marshall McLuhan Salon, Canadian Embassy, Berlin.

Hybrid Lecture Player

Video, text and image credits

All images, video and text are under the respective copyright holders' copyright terms.

Copyright

All video, text and images remain the copyright of the respective authors.

Bibliography

[1] archive2020, Dekker 2009 - "Archive2020 - Sustainable Archiving of Born-Digital Cultural Content." Virtueel Platform, 2009. Accessed April 21, 2015. http://issuu.com/virtueelplatform/docs/archive2020_def_single_page.

[2] Archive Manifesto, Hui 2013 - Hui, Yuk. “Archivist Manifesto” 2013. http://www.metamute.org/editorial/lab/archivist-manifesto.

[3] Archive Architectures, Worthington/Zehle/Cornwell/van Mourik Broekman 2015 - Worthington, Simon, Soenke Zehle, Peter Cornwall, and Pauline van Mourik Broekman. "Archive Architectures." Network Ecologies Digital Scalar Publication, January 2015. http://sites.fhi.duke.edu/ecologyofnetworks/category/networkecologies-scalarpublication/.

[4] A Personal Computer for Children of All Ages, Kay 1972 - Kay, Alan. "A Personal Computer for Children of All Ages." Boston, 1972. http://www.mprove.de/diplom/gui/kay72.html.

[5] Free, Rogoff 2010 - Rogoff, Irit. "Free." e-flux, 2010. http://www.e-flux.com/journal/free/.

[6] Traces on McLuhan - "Traces of McLuhan | Hybrid Publishing Consortium." Accessed April 22, 2015. http://www.consortium.io/traces-mcluhan.

[7] McLuminations, Winkler, Gottlieb 2014 - Winkler, Steffi, and Baruch Gottlieb. "McLuminations - An Introduction to the Marshall McLuhan Archive," 2014.

[8] Sommerville 2011 - Sommerville, Ian. Software Engineering, 9th Edition. Accessed April 29, 2015. http://www.amazon.ca/Software-Engineering-9th-Edition-Sommerville/dp/0137035152.

[9] Traces on the Archive - Dossier, Kral, Worthington, 2015.

Windows | Hybrid Publishing Consortium, Partners and Software

Hybrid Publishing Consortium

At the Hybrid Publishing Consortium (HPC) we pursue a model of digital interoperability, emphasising frameworks as opposed to platforms, that combines ISO standards and experimental approaches to dynamic publishing (A Personal Computer for Children of All Ages, Kay 1972) with existing Open Source expert projects. Around these Open Source projects we engineer a software ecosystem, called A-machine, that allows us to create numerous solutions connecting different software modules for a variety of publishing workflow situations.

A-machine can be briefly described as HPC's software ecology that provides publishers with the latest Open Source multi-format workflows. Its research involves three tracks:

  1. Examining publishing workflows and identifying Open Source projects to support them;
  2. Building software for multi-format transformation based on a 'single source' architecture;
  3. Promoting an 'application framework' and API model to allow interoperability between Open Source providers.

These three tracks feed into the umbrella theme of 'Designing the Book of the Future', which encapsulates our research ambitions to enhance the technology of Moveable Type. The HPC research on the development of A-machine is carried out by running four case studies, each focusing on different aspects of digital publishing development.

This involves collection management, hybrid media annotation, automation, markup standards, single-source publication conversion into multiple output formats, etc. We are determined to keep this concept modular, so it can flexibly grow, accommodate different needs and types of publishing projects, and be scalable. The work on this lecture tool is the beginning of a further investigation into how Open Source lecture tools, such as Matterhorn, edX, etc., can be improved and better suited to different audiences.

Building blocks of the A-machine archive publishing workflow
Figure: Building blocks of the A-machine archive publishing workflow; digitisation; fitting to the skills of the users; the trace as an object-in-the-archive; and modes of publishing from the archive.

Partners, Co-creators and Software

Avco

Interactive software development, specialising in visualisations and using Famo.us technologies
http://avco.com

Avco
Figure: Avco / Aldeburgh / The Space ‘Musicircus’ app 2014. http://musicircus.thespace.org/?nocols=3%26hud=1

Avco's recent software development has been focused on experimental data interfaces. For John Cage's centennial celebrations, Aldeburgh Music and the arts organisation The Space commissioned Avco to develop the 'Musicircus' app (http://musicircus.thespace.org/?nocols=3%26hud=1), an online experience of a John Cage-inspired happening at Aldeburgh, the home of Benjamin Britten, on the Suffolk coast, England.

In 2009 Avco created the 'Music Map' (http://www.nmcrec.co.uk/musicmap) for the avant-garde music publisher NMC to promote its catalog of artists. In 2015 Avco will create a new version, using Open Source JavaScript libraries and open standards to ensure the map's future viability.

Avco was established in 1997 by Daniel Jackson and Tina Spear.

freizo

Data Futures, Institute for Modern and Contemporary Culture, Westminster University
http://www.data-futures.org/

freizo
Figure: Collection implementation by data futures. The University of Westminster's Chinese Poster Collection is a unique archival collection of some 800 posters spanning the period between the late 1950s and the early 1980s. http://chinaposters.westminster.ac.uk/

freizo is an object-oriented sustainable archive platform developed by the Data Futures Lab at the Institute for Modern and Contemporary Culture, Westminster University, England. It has been used both to create new community workflows in major projects such as Princeton's Phono-Post and to reclaim legacy archives that have become inaccessible because of obsolescent technology.

Heidelberg Research Architecture

Software ecosystem for advanced multi-lingual Digital Humanities collection management

http://www.asia-europe.uni-heidelberg.de/en/hra-portal.html

Heidelberg Research Architecture
Figure: Heidelberg Research Architecture (HRA) portal, with links to major collections, guides and Open Source tools. Example collections include: Digital Corpus of Sanskrit; Global Politics on Screen; and Video Annotation Database

Under the umbrella of the Heidelberg Research Architecture (HRA), Heidelberg University, Germany, IT scientists, software developers, database architects, academically trained professionals in e-learning as well as IT support staff collaborate with researchers and students at the Cluster for Excellence “Asia and Europe in a Global Context“ to form an integrated digital humanities environment for interdisciplinary and internationally distributed studies of transcultural dynamics.

The HRA's agenda thus covers the entire range from developing sophisticated metadata frameworks for making transcultural relations traceable and analyzable across diverse media types (texts, images, films, audio) and disparate objects (texts, concepts, social networks), to offering IT support for the Karl Jaspers Centre, Heidelberg University.

Tamboti by Heidelberg Research Architecture

Digital collection management using US Library of Congress meta description frameworks

http://www.asia-europe.uni-heidelberg.de/en/hra-portal.html

Tamboti
Figure: Tamboti interface for meta description editing and collection management. Such tools are needed to create machine-readable and exact records for scholarly research. Tamboti has been created through the experience of managing over ninety world-class collections at Heidelberg University, Germany

Tamboti is an application for working with metadata based on the MODS, VRA and Text Encoding Initiative (TEI) standards.

MODS is a standard used to catalogue books, articles and other traditional library material, as well as visual or online material such as images, videos, web sites or other sources.
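As a rough illustration of what such a description looks like, here is a minimal, hypothetical MODS record for a lecture video (invented for this text, not drawn from the Tamboti collections):

```xml
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <title>Finally Getting the Message</title>
  </titleInfo>
  <name type="personal">
    <namePart>Larkin, Graham</namePart>
    <role><roleTerm type="text">lecturer</roleTerm></role>
  </name>
  <!-- "moving image" is one of MODS's controlled resource types -->
  <typeOfResource>moving image</typeOfResource>
  <originInfo>
    <dateCreated>2011</dateCreated>
  </originInfo>
</mods>
```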

Tamboti is similar to the public interface of the Online Public Access Catalog (OPAC) found in the library systems of national and university libraries, and it can process a large number of records.

In Tamboti, you can annotate records with metadata concerning language, script and transcription. This function is important for records used in multilingual contexts.

Tamboti is based on the eXist native XML database. Like eXist itself, Tamboti is Open Source and free, and the eXist database system can be installed with a standard application installer.

Transpect, le-tex

Document transformation for multi-format publishing
https://www.le-tex.de/en/transpect.html

transpect
Figure: Example Transpect document transformation workflow, in this case showing a DOCX-to-InDesign transformation

As the company’s name implies, le-tex has a background in mathematical typesetting with TeX. Since its inception in 1999, le-tex has been active in the field of structured information, particularly SGML, XML, and HTML, with a focus on modelling, converting, checking, and rendering scholarly content. Since 2008, le-tex has been developing conversion tools for XML-based file formats on the basis of the XSLT 2.0 programming language, a W3C standard. The XML-based file formats include OOXML (.docx), IDML (InDesign), and EPUB. In 2012, le-tex published its conversion and checking libraries and their configuration methodology as a software framework, called Transpect, under a BSD Open Source license. In addition to being Open Source, Transpect employs only standardized technologies – namely XSLT 2.0, which is orchestrated into more complex workflows by XProc scripts – XProc also being a W3C standard.

HPC has been working with le-tex and maintains a public repository from which the Transpect software can be installed: https://github.com/consortium/BinB

pan.do/ra

Video archive, collection and curatorial management
http://pan.do/ra

Pandora
Figure: Pandora archive annotation software interface showing a selection of the 42 videos held in the McLuhan Salon's collection in Berlin. Created at an HPC workshop in 2014 at the Canadian Embassy, Berlin.

pan.do/ra is a free, open source media archive platform. It allows you to manage large, decentralized collections of video, to collaboratively create metadata and time-based annotations, and to serve your archive as a desktop-class web application.

HPC has worked with the Pandora software with the support of the Heidelberg Research Architecture team; the objective is to allow Pandora, via its API, to serve as an optional video repository as an alternative to YouTube.
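As a sketch of what such an API integration might involve: a pan.do/ra client typically posts an action name together with a JSON-encoded query to the instance's /api/ endpoint. The request shape below is an assumption for illustration only; the API of a running pan.do/ra instance should be consulted before relying on it.

```javascript
// Build the body of a hypothetical pan.do/ra 'find' request:
// an action name plus a JSON-encoded query. Field names follow
// pan.do/ra conventions but are assumptions in this sketch.
function buildFindRequest(keyword, range) {
  const query = {
    query: { conditions: [{ key: "*", value: keyword, operator: "=" }] },
    keys: ["id", "title"], // which metadata fields to return
    range: range,          // e.g. [0, 10] for the first ten matches
  };
  return { action: "find", data: JSON.stringify(query) };
}

// A client would POST this, form-encoded, to <instance>/api/.
const req = buildFindRequest("McLuhan", [0, 10]);
```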