Working on the Visionary Cross project never ceases to amaze me and to provide new learning opportunities, while at the same time testing the limits of systems designed for and made available to the research community. If last year was dedicated to figuring out how data and research outputs would be disseminated so that they can be accessed by the widest possible audience (both targeted and non-targeted), this year is starting with the implementation of those decisions.
But some things never really change: we are getting mixed results from working with Zenodo. It was an easy and straightforward affair to get things uploaded and published on Zenodo. We already have RAW PHOTOGRAPHY uploaded (as I discussed in my last post) for both the ‘RUTHWELL CROSS‘ and the ‘BRUSSELS CROSS‘, complete with DOI assignment. That was the success part!!!
At the same time, however, these uploads have not been approved for publication on the Visionary Cross project page on Zenodo, as the Zenodo previewer is not able to generate a preview of the .zip file. We are in talks with their tech team to resolve this issue, along with a second issue: the lack of a 3D previewer in Zenodo.
It sure will be a year of learning and developing.
PS: Click the hyperlinked words to reach the community page and access the complete raw photography.
The Visionary Cross project is an extensible, multi-object, multi-media edition of a Cultural Matrix in Anglo-Saxon England. It is built around mediated representations of sculpture, buildings, and text. It employs XML transcriptions, high resolution 2D photography, 3D laser scans, 3D photogrammetry, and a socially focussed game engine. The project is about both the objects it includes (several of which are among the most studied from the period) and the relationships among them. The edition is intended to appeal to scholars and the interested public. It is also intended to be extensible: we hope that others will want to use our material in ways that we do not anticipate in the context of other collections and approaches to these cultural objects, including commercial and tourism applications.
When we first began planning this edition, we devoted most of our attention to questions of user interface. How would users interact with our objects? What kind of metaphors could we use to indicate the close relationship among the different components? Did we need to use a game engine? Did we need to design a special viewer? What kind of interface could we use that would contain 3D objects, 2D text and images, and, potentially, a navigable representation of location in which these different objects were contained?
As our project has progressed, however, we have gradually begun to reverse our assumptions. A project like the Visionary Cross, we believe, is less about the interface than the quality of its data and the usefulness of its API. Influenced especially by the work of Peter Boot and Joris van Zundert (Boot and van Zundert 2011; van Zundert 2012), we have come to see our own primary role to be the providers of a service rather than a self-contained scholarly object.
This change in approach has manifested itself in several ways.
In the first place, we no longer worry about interconnectivity among our several “views” of the raw material. The Visionary Cross team involves several sub-projects, led by groups of scholars with very different interests in the raw material of our edition. While we were initially concerned about enforcing common standards among these different sub-projects as a means of ensuring interoperability, we now see the sub-projects as independent use-cases, developed by members of the project team, but otherwise similar to what we hope will be developed by independent users once the project as a whole is released.
In the second place, we are now also much more aware of the one-off uses proposed by external users. Early on in the project’s history, we found ourselves being “distracted” by demands on our time by external users who wanted us to develop instances of our material for their personal use: views of specific panels on our crosses or pages from our manuscripts; models that users could use to develop physical facsimiles; images and details from our object for use by archaeological authorities or, for tourism purposes, by the institutions who control the original objects. As a result of our change in approach, we now see these requests as a core use of the project as a whole: a truly extensible project, we believe, must be open to use in the ways that users want, not only those anticipated by the project designers.
The biggest manifestation of this change in approach, however, has been the massive simplification of our plans for publication. Where we originally planned on a heavy, self-contained interface for the edition’s multiple views, objects, and approaches, we are now working with a model inspired by the University of Pennsylvania’s OPenn project for a minimalist representation. Individual objects will be published to the web as research objects in their own right, presented in a format that will allow easy, no-frills access to views, downloadable files, and scholarly discussion. But we will not, on the whole, attempt to construct a single interface to guide readers’ interactions with the group. And we will be including among our files formats suitable for use in rapid manufacturing and the development of tourism applications.
The Visionary Cross project was initially conceived of as a massively integrative scholarly edition: an edition that would not only edit the objects in our collection, but also the connections among them. As we have acquired the objects and begun to think more concretely about publication, however, we have come to believe that the future of the scholarly digital edition lies in fact in its atomisation: the provision of minimal and “just-in-time” interfaces and file formats that allow others to work with our material in ways that suit their needs. This paper is about the choices and tradeoffs we are making in pursuing this new, minimalist approach.
Boot, Peter, and Joris van Zundert. 2011. “The Digital Edition 2.0 and The Digital Library: Services, not Resources.” In Digitale Edition und Forschungsbibliothek Beiträge der Fachtagung im Philosophicum der Universität Mainz am 13. und 14. Januar 2011, edited by Christiane Fritze. Wiesbaden: Harrassowitz.
van Zundert, Joris. 2012. “If You Build It, Will We Come? Large Scale Digital Infrastructures as a Dead End for Digital Humanities.” Hungarian Studies Review: HSR 37 (3): 165–86. http://0-search.ebscohost.com.darius.uleth.ca/login.aspx?direct=true&db=sih&AN=77596210&site=ehost-live&scope=site.
This was supposed to be done before the new year began, but the past year did not end as well as I expected because of some serious sickness. Still, we got quite a lot done at the Visionary Cross.
For me as a researcher and project manager, 2016 was quite good. We were finally able to decide upon our publication platform for dissemination of raw as well as processed data from this project and are working towards publishing it (not all, just the raw photography) by the end of the end. The work done by MITACS student Ms Fan Yan under my co-supervision took the shape of a presentation. For me it was amazing to present my experiences and thoughts about managing heterogeneous data as well as a multidisciplinary team of really competent researchers: ‘Metadata, Paradata and Standards: Management Challenges in 3D Scholarly Edition Project’ at the NEH-funded ‘Advanced Challenges in Theory and Practice in 3D Modeling of Cultural Heritage Sites’.
Looking forward to a healthier and more innovative year ahead.
I see with horror that it’s been a year since we last posted something to the Visionary Cross site!
This is not due to lack of work, as there has been plenty going on behind the scenes. Rather it has been the result of a series of unfortunate circumstances (especially a series of student sick-leaves) and the nature of the work itself, which has been a little more difficult to write up in small pieces than was true in 2015.
Our main work for 2016 has been developing the infrastructure to follow up on our December 2015 posting on the Visionary Cross data model. As we mention there, the Visionary Cross project has always seen itself intellectually as a curated data set and expressed allegiance to the kind of “just-in-time” editing advocated for by Joris van Zundert and others (see Boot and van Zundert 2011 and van Zundert 2012 below). But we had not, until our December 2015 meeting in Lethbridge, really understood the implications of that understanding in terms of what we might call “the craft of edition-making.” As I mentioned in my posting on the Lethbridge meeting, we had always understood the Digital Library parts of our project as being a question of system–DSpace vs. Omeka vs. Greenstone–and not really recognised the extent to which those systems were really secondary questions to metadata and organisation: “left hand” issues in the terminology of our meeting.
Investigating data publication: OPenn Pros and Cons
The model we began to use instead was that of OPenn, the new and minimalist Digital Library/Repository published by the University of Pennsylvania library. And in fact we spent most of the Spring looking at what would be required to get our data into OPenn: working out metadata standards required and, especially, thinking through the nature of our objects and their relationship to each other.
In the course of the Spring, however, we began to realise that OPenn was a good model, but not a great solution, for the particular needs of the Visionary Cross project. In particular, we ran into two main issues that led us into looking at ways of emulating rather than joining the Pennsylvania model:
- OPenn is organised around physical repositories and can’t easily handle virtual collections;
- OPenn is a system that requires negotiation to join.
The first case is an interesting mismatch: OPenn was designed to showcase collections from repositories. The unstated assumption behind this is, firstly, that the poster is the owner of the collection and is able to speak for that repository; and, secondly, that the collection is best organised by repository.
In our case, however, we are researchers rather than repository owners, and our material is both single objects from external repositories and something that gains meaning from their cross-repository relationships with each other. We felt quite reluctant to propose repositories to OPenn (and behave as the owners of these) for external parties like the Cathedral library in Vercelli or the Ruthwell and Bewcastle churches, especially when we needed to establish these repositories to hold single objects we were using in the context of a cross-repository collection. Let’s say, for example, we’d also been using a page from a manuscript in the British Library–a repository not represented in OPenn at the moment: would we then establish the British Library node on their behalf?
This then leads us to the second point: the degree to which OPenn requires negotiation to join. Our vision for the Visionary Cross project is for a dataverse of objects that can be used and added to by anybody. Placing something in OPenn requires the agreement of OPenn and, potentially, the physical repository itself. To establish or add to a repository named for the British Library, for example, presumably requires the permission of the British Library. And it also requires you to agree to the terms mandated by OPenn–with its very open licence. In our case, however, we are also working with material that has different levels of openness. While the data we produce is available CC-BY, some of the data we use is under much more restrictive licensing: we still need to be able to “include” this data in some way (i.e. have it listed as part of our virtual collection), without forcing a more open licence on it than its owners are prepared to give.
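One way to picture this kind of "inclusion without relicensing" is a virtual collection that only lists pointers to objects, each carrying the licence set by whoever published it. The following is a minimal sketch under stated assumptions, not the project's actual data model: the class names, field names, URLs, and the "All rights reserved" entry are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """A pointer to an object held elsewhere, carrying its own licence."""
    title: str
    holder: str     # institution that controls the original object
    location: str   # where the data lives (hypothetical URLs below)
    licence: str    # licence chosen by whoever published the data


@dataclass
class VirtualCollection:
    """A named list of items; listing an item never changes its licence."""
    name: str
    items: list = field(default_factory=list)

    def add(self, item):
        self.items.append(item)

    def by_licence(self, licence):
        """Return only the items released under the given licence."""
        return [i for i in self.items if i.licence == licence]


# All values below are invented for illustration.
collection = VirtualCollection("Visionary Cross (illustrative)")
collection.add(Item(
    title="Ruthwell Cross raw photography",
    holder="Ruthwell church",
    location="https://example.org/ruthwell/raw-photography",
    licence="CC-BY",
))
collection.add(Item(
    title="Manuscript page images",
    holder="External library",
    location="https://example.org/external/manuscript",
    licence="All rights reserved",
))

# Both items are listed, but only one may be freely reused.
for item in collection.by_licence("CC-BY"):
    print(item.title, "->", item.location)
```

The point of the design is that membership in the collection is just a listing: the restrictive item sits alongside the open one without either licence being altered.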
Catch and release: Zenodo? Github? Some other system or combination?
Our work in the late Spring and Summer, therefore, involved investigating other ways of collecting and publishing data for our project. Our requirements during this stage were:
- The system should be as simple as (and as far as possible compatible with) OPenn;
- Participants should be able to use, contribute to, and organise the collection without negotiation;
- The system should be open to multiple virtual organisations (i.e. not repository-based);
- The system should be agnostic as to licensing, data formats, and so on (while recognising that some contributions may not be eligible for participation in ).
(We wrote up a version of this as an abstract for the Digital Scholarly Editions workshop in Graz: you can see the details here).
In the end, we decided that the real solution to this problem was to have no system at all: to focus instead entirely on making data as discoverable and well-documented as possible, while avoiding any requirement that others join our system in order to participate in or contribute to the collection.
In the course of the summer, therefore, we began searching for systems that would provide long-term, non-negotiated accessibility to our data and metadata and discoverability standards that would support non-negotiated access and reuse. After investigating several options, including the University of Lethbridge’s Institutional Repository, Figshare, arXiv, and Github, we decided to go with Zenodo, a repository hosted by CERN for the European Union and dedicated to the open distribution of scientific data. We have recently established a Zenodo “Community” for our data (https://zenodo.org/communities/visionarycrossproject/), and expect to start publishing our first datasets early in the new year.
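As a rough illustration of what "non-negotiated access" can look like in practice, the sketch below builds a search URL for records in a Zenodo community and pulls titles and DOIs out of a decoded response. It assumes the public records search endpoint and its `communities` and `size` parameters as described in Zenodo's developer documentation, and it assumes a response shaped as `hits.hits[].metadata.title` with a top-level `doi` per record. The sample response and its DOI are invented placeholders, and no network request is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint, per Zenodo's public developer documentation.
ZENODO_RECORDS_API = "https://zenodo.org/api/records"


def community_query_url(community, size=10):
    """Build a search URL for records belonging to one community."""
    params = {"communities": community, "size": size}
    return f"{ZENODO_RECORDS_API}?{urlencode(params)}"


def summarise(response):
    """Extract (title, doi) pairs from a decoded JSON search response.

    The expected shape is an assumption; missing keys yield None
    rather than raising an error.
    """
    hits = response.get("hits", {}).get("hits", [])
    return [(h.get("metadata", {}).get("title"), h.get("doi")) for h in hits]


# Invented sample response; the DOI is a placeholder, not a real record.
SAMPLE_RESPONSE = {
    "hits": {
        "hits": [
            {"doi": "10.5281/zenodo.0000000",
             "metadata": {"title": "Example dataset"}},
        ]
    }
}

print(community_query_url("visionarycrossproject"))
for title, doi in summarise(SAMPLE_RESPONSE):
    print(title, doi)
```

Anyone with the community identifier could fetch the resulting URL with a standard HTTP client; nothing in the exchange requires prior agreement with the project.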
Yesterday, Dot Porter, one of the leads on the Visionary Cross project, visited Lethbridge for a project meeting (and to speak to my DH class). The main purpose of her visit was to plan the work that needs to happen on the Digital Library side of the project.
This is a core issue for us. As we say on the front page:
The Visionary Cross project is an international, multidisciplinary research project whose principal objective is the development of a new kind of digital archive and edition of texts and objects associated with the Visionary Cross tradition in Anglo-Saxon England.
Taking its cue from recent developments in digital editorial theory and practice, the project takes a data-centric, distributed, and generalisable approach to the representation of cultural heritage texts, objects, and contexts in order to encourage broad scholarly and popular engagement with its material.
The important things here are that it is an archive-edition: it is data-centric, distributed, and (supposed to be) designed to encourage broad scholarly reuse and popular engagement. In our thinking on this, we have been very influenced by the work of Peter Boot and Joris van Zundert on “Just in Time editing” or, especially, the edition as service. In other words, we have understood our work to be not primarily an interface, but rather a service: a source of mediated primary material. This is in keeping with the philosophy of my edition of Caedmon’s Hymn, where the emphasis was on exploiting the power of existing web protocols, infrastructure, standards, and services, rather than custom programming.
In practice, however, this has proved to be something of a block in our progress. Since the very beginning, the Visionary Cross project has approached the problem of editing its objects as an intellectual market place. We’ve had several teams in place who have been working with our common material in different ways: as a Serious Game, as a student-centred resource, as raw material for humanities research, as part of a different edition of a related collection. In each case, the participants have been working alongside each other, rather than directly in collaboration or cooperation, in part because the thing the project has been leveraging has been the overlap in their enthusiasm and their interest in the common dataset. We’ve wanted people to want to share resources because they see how this sharing allows them to do their own research in the directions that appeal to them, rather than to try and bend their interests towards a common, lowest-common-denominator consensus-focussed single interface.
We began this way initially for funding reasons: we didn’t have enough (our first grant awarded us only 25% of our ask) and the only way of getting any work done was to tie our project to the interests of its participants as they worked on other things.
But over the years we began to see this as a virtue as well. By the time we did get all the funding we asked for, this 百花齊放，百家爭鳴 (Let a hundred flowers bloom, let a hundred schools of thought contend) approach had become a part of the goal of the project: we now wanted to exemplify the way we thought our project should be used by others in our internal workings (and, of course, by the time the funding arrived, we were committed to our different streams anyway).
The downside to this approach, however, has been that it has proved to be difficult to manage: the sub-projects are themselves quite different from each other (though with at times considerable overlap in some aspects) and the result has been that it has been difficult to do the common work. It has also, in some cases, led to minor friction: overlap can, after all, look a bit like competition. As individual projects work on their interfaces, navigation, content, and the like, there’s been little incentive to pay attention to the common aspects of our work; and more importantly, preparing content (objects, annotation, approaches, etc.) has generally involved customised work for specific sub-projects: instead of developing a core set of intellectual objects (metadata, annotation, etc.), we’ve basically had different groups adding custom intellectual objects to a limited set of common core facsimiles (i.e. the 3D models, photography, and, to a limited extent, transcriptions).
This is both why we were having the meeting yesterday and why its results were so important. The goal of the meeting was to lay down the groundwork for building the central Digital Library that would allow us to build an appropriate place for projects to feed back into the common body of objects and to provide a place where generalisable scholarship and mediation could be done: i.e. a place where we could develop common metadata, commentary, annotation, and the like that could then be distributed to the sub-projects.
The result was the following diagram:
The way to read this is that we currently have the situation at the far left and far right. I.e. as at the left, we have a lot of use-cases for our data–a Serious Game, a student-focussed reader, some work on a scholarly edition. And as at the far right, we have a collection of files: raw files, processed files, working copies, etc., all organised by origin (i.e. a dump of the different drives, cameras, scanners, and so on). What we don’t have, is the middle: an object-organised collection of objects, metadata, and intellectual objects that can serve as an engine for the projects at the left. And this is where our problems are coming from: since we are missing that engine, the sub-projects are developing their own collections of intellectual objects and processed files.
Initially, it was as a middle that I thought we needed a digital library application. I.e. that the solution would be to set up an Omeka, DSpace, or ContentDM installation and put our stuff in it. But we were getting hung up on the choice of software: was Omeka better or worse than Greenstone for this? How would the interface look? And so on.
What we realised yesterday, however, was that these were actually implementation (i.e. left-hand) issues, more or less the same as the questions about our various viewers and environments. That if we really saw the core of the edition as a curated collection of intellectual objects that were intended primarily for reuse by others, then we needed to focus entirely on the data–providing a simple, easily understood, open, and extremely robust collection that could be then used by others for any sort of other purpose, including putting in a Digital Library Application.
A model for this is OPenn. This is an example of the “digital library” in its most simple form: as a well thought out and curated and minimally skinned series of directories with predictable content and naming conventions… and nothing else. As Dot has shown in her own subsequent work, this approach is in fact extremely powerful: it is easy to add to (I was in Penn for the launch of OPenn this Spring, and it has already grown rapidly in the intervening months to include collections from other institutions in the Philadelphia area); and it is easy to use for added-value projects: Dot has used this system to build eBooks, page-turning software, a searchable digital library, and so on.
Moreover, as Dot showed in one of her two lectures yesterday, OPenn originated from what was actually a similar problem: a collection that existed in what we are describing here as a “left-hand” format without a corresponding “middle” (she also indicated that they might have had a similar “right hand” problem as well, but that’s not important for us at the moment). The Schoenberg Institute at the University of Pennsylvania was created “to bring manuscript culture, modern technology and people together to bring access to and understanding of our cultural heritage locally and around the world.” Penn itself began a digitisation programme as early as the late 1990s, and, I believe, has now fully digitised and released its manuscript collection under a Creative Commons licence (CC-0, in fact). Like many libraries, it then released this material to the public using a “page turning” interface that allows users to “read” the manuscripts online in much the same way they would the actual codex (this is an interface design loved by museum and library directors, and, reportedly, the general public, but hated by most scholars who work with manuscripts).
The problem, however, was that apart from reading online, there was not much one could do with the material. It was possible to download individual manuscript leaves (and, if you worked at it, all the leaves in a given manuscript, page by page). There was also largely untyped output from the library MARC records for each manuscript that could, as far as I can see, be scraped, if you wanted to use it. But there was no easy way of accessing the resource for other reasons or to repurpose the material for other applications.
The solution to this was to develop OPenn. This is, in essence, a refactoring of Penn-in-Hand in its absolutely most simple form: as a directory structure in which objects and associated metadata are grouped together. Each level in the directory can be browsed by a human as a web-page with links and images (a system so simple that, apart from the browser-side XSLT processing that is going on, there is nothing that couldn’t be accessed via Netscape Navigator or maybe even Lynx). But more importantly, each level can also be accessed by utilities like wget (allowing you to download entire collections programmatically) or by URL (allowing you to address individual files externally). There are no bells and whistles here (there’s not even a search function, though as Dot showed, you can build one in viewshare). There is nothing to maintain, other than paying for the server and your ISP. Directories and files are not even core Internet architecture, they are core computing architecture and not going anywhere any time soon.
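To make the "directories and URLs are the interface" idea concrete, here is a minimal sketch of how a script might read one browsable directory page and turn its links into addressable file URLs, which is roughly what a tool like wget does when it walks a collection. This is an illustration under stated assumptions and not OPenn's actual layout: the base URL, the file names, and the sample page are all hypothetical, and the sketch parses an in-memory page rather than fetching anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_directory(base_url, index_html):
    """Return absolute URLs for entries linked from a directory page.

    Links that point upward or outside the directory are skipped.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    urls = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        if absolute.startswith(base_url) and absolute != base_url:
            urls.append(absolute)
    return urls


# Hypothetical directory page; the base URL and file names are invented.
BASE = "https://example.org/data/ruthwell-cross/"
SAMPLE_INDEX = """
<html><body>
<a href="../">Parent directory</a>
<a href="metadata.xml">metadata.xml</a>
<a href="images/">images/</a>
</body></html>
"""

for url in list_directory(BASE, SAMPLE_INDEX):
    print(url)
```

Because every file has a stable, predictable address, nothing beyond a standard HTML parser and URL joining is needed to build on top of such a collection.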
But the important thing is that, by reducing their DL to its absolute most simple form, OPenn makes it easier to design almost everything else a user might want. In addition to the search interface, Dot showed us how she had then used external systems and libraries to build different ways of accessing the Schoenberg material–as eBooks, for example, or even through a page turning interface library. In other words, by massively–you’re tempted to think almost irresponsibly–simplifying the core service of their project, OPenn is able to do everything Penn-in-Hand does, and much much more easily.
So this, I think, is where we have to go with the Visionary Cross. To focus on the core aspects of our content–metadata, curation, content, and high quality objects–and present this to potential users (and our various subprojects) in as simple a fashion as possible: in a way that focusses purely on making the data available in a responsible fashion and ignores all questions of interface, tools, and other things we commonly consider when we think of “digital editions,” in order to do a good job of delivering the data in a form that others can use more easily for their own projects.