Archive for category Proceedings of THATCamp

Looking back on THATCamp Texas

I’m still relishing the conversations I took part in at THATCamp Texas in Houston last month. Some of my initial thoughts on the perks of this interdisciplinary meet-up can be found in Natalie Houston’s ProfHacker column (bit.ly/gf2yRx). She included feedback from several participants, thus crowdsourcing coverage of the event in a manner quite fitting to the THATCamp spirit. In this post, I relate some additional aspects of THATCamp Texas that stood out for me.

The first of the two days consisted of Bootcamp sessions. I attended five in the “Creating and Managing Digital Projects Track.”  Here’s a brief overview of each:

1) Building Digital Collections with Omeka

This session, run by Amanda Focke, gave an overview of how to gather, tag, and present images of documents and artifacts with the web exhibition software Omeka. Participants had hands-on practice inputting items, seeing how the Dublin Core tagging system worked, and learning about various plug-ins. We were able to check each other’s preliminary entries as well as examine the finished collections that Rice University’s library currently curates with Omeka.
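For anyone who has not worked with Dublin Core before, a single Omeka item boils down to a small set of standard metadata elements. Here is a minimal, hypothetical sketch in Python of what one item’s record might look like before entry (the values are invented; Omeka itself collects these through web forms or its API):

```python
import json

# Hypothetical Dublin Core record for one item in a digital collection.
# The fifteen Dublin Core elements include Title, Creator, Subject, Description,
# Date, Format, Identifier, Coverage, and Rights; only a few are shown here.
item = {
    "Title": "Letter from a Houston ship channel engineer",  # invented example
    "Creator": "Unknown",
    "Date": "1914-06-02",
    "Subject": ["Houston (Tex.)", "Harbors"],
    "Description": "One-page handwritten letter describing dredging work.",
    "Format": "JPEG scan of manuscript, 300 dpi",
    "Rights": "No known copyright restrictions",
}

# Serialize the record so it can be reviewed or bulk-imported later.
print(json.dumps(item, indent=2))
```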

2) Building a Web Presence with WordPress

Even if you have experience with blogging, WordPress holds plenty of surprises. This session, run by Chris Pound, covered how to register and set up a unique domain name, discussed differences between blogging and other more stable webpage hierarchies, showed how the self-hosted open-source version of WordPress differs from the version hosted on WordPress’s own servers, and ran through many of the user-controlled settings.

3) Introduction to Producing Electronic Texts Using the Text Encoding Initiative

Though I’d done a workshop involving TEI tags a couple of years ago, this one was an improvement in several ways. Crucially, this session, run by Lisa Spiro, provided not just training in software but also a useful overview of the history and rationale for TEI, focusing in the second half on Oxygen as one way to encode this information. Furthermore, Oxygen itself had changed since I’d last looked at it (there are many improvements from version 9.2 to 12.1). Also, as I found throughout the conference, the use of Twitter was a real perk: comments from other participants during the session and afterward provided leads to other available text editors and let us compare our experiences validating sections of code.
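As a taste of what the encoding looks like, here is a deliberately tiny, hypothetical TEI document, checked here only for XML well-formedness with Python’s standard library (Oxygen or a full TEI schema would do real validation):

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical TEI document: a header describing the file plus one
# encoded paragraph with a tagged personal name.
tei_sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter, 1850 (invented)</title></titleStmt>
      <publicationStmt><p>Unpublished workshop exercise.</p></publicationStmt>
      <sourceDesc><p>Transcribed from an imaginary manuscript.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>My dear <persName>Eliza</persName>, the rains have finally stopped.</p>
    </body>
  </text>
</TEI>"""

# ElementTree only confirms the XML is well-formed; it does not check TEI conformance.
root = ET.fromstring(tei_sample)
print("Root element:", root.tag)
```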

4) Managing Scholarly Digital Projects from Start to Finish

This title sounds like a lot to cover in an hour, but it was a very practical session. Andrew Torget talked about the value of finding individual grants to cover sub-parts of your project rather than expecting one grant to pay for everything; planning for a stage in which you refine your prototype; the long-term advantages of using open source software; and working to get the word out on a project as early and as often as possible. He illustrated these themes with details from many completed digital projects that he’s been a part of.

5) Using regular expressions to match and manipulate text strings

This was the most technical of the sessions I attended, but it was clear and accessible and has proven to be very helpful. I’ve read about and even taught preliminary lessons on regular expressions before, but this session gave a super useful framework for the range of places where regular expressions show up and how they can be advantageous in many projects. We tried out several exercises to capture text patterns through different regex combinations and were left with a number of good links to online resources.
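For readers who have not tried regular expressions, here is a small Python illustration of the kind of pattern matching we practiced (the sample text and patterns are my own, not the workshop’s):

```python
import re

text = "Letters dated 3 Jan 1861, 17 Mar 1862, and 9 Sept 1864 mention the port."

# Capture day, month abbreviation, and year as separate groups.
date_pattern = re.compile(r"(\d{1,2}) ([A-Z][a-z]{2,4}) (\d{4})")
print(date_pattern.findall(text))
# [('3', 'Jan', '1861'), ('17', 'Mar', '1862'), ('9', 'Sept', '1864')]

# Reorder each match into a year-month-day string.
print(date_pattern.sub(r"\3-\2-\1", text))
```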

The second day consisted of the actual “unconference” sessions of THATCamp. I attended three sessions as well as the energetic joint scheduling discussion and the dorkshorts presentations. The main sessions I took part in were on Productivity, Crowdsourcing, and Text Mining. And I truly wished for a time-turner like Hermione Granger’s so that I could have attended several other interesting-looking sessions occurring at the same time. I addressed the crowdsourcing and text mining sessions in the ProfHacker post; both also resulted in group-edited Google Docs write-ups (see bit.ly/hi2sUY and bit.ly/lRuJlV, respectively). So I’ll mention here just the productivity session, facilitated by Natalie Houston. This discussion offered a good chance for participants to compare the difficulties they find in juggling scholarly projects, other academic obligations, and personal life. The conversation produced both high-tech and no-tech suggestions for ways we might clear the decks in order to focus on our chosen tasks more productively. And it realistically showed that no single solution should be expected to fit every scenario.

How will I apply the things I’ve learned? At the moment, I’m working on a long-term corpus project where Andrew’s reminders about managing the workflow and thinking through the funding stages of a project are proving immediately useful. The TEI workshop and regex session have inspired me to standardize and update the format of some other data I’m working on that has an upcoming publication deadline. And frankly, following a dozen new people on Twitter whose work is interesting and inspiring is helping me daily to connect to a larger network of digital humanities scholars. In the long term, I hope to attend several future THATCamps. And as a bonus, I’ve been in contact with other Texas campers from three local universities. We are starting talks about coordinating another THATCamp ourselves in our end of Texas in the near future.

Using GIS to Visualize Historical and Cultural Change

I am interested in discussing how GIS mapping technology can help visualize cultural transformation in specific communities. Ideally, I would be able to show this change at the local and international border levels. My dissertation research compares the development of Mexican American transborder communities on the Texas-Mexico border with Franco American transborder communities on the Maine-Canada border. I focus on intermarriage and language practices at the turn of the twentieth century. I have some experience using GIS mapping technology in the classroom through creating interactive mapping activities (U.S. Southwest module of sacarcims.sac.alamo.edu/default.htm) and in conjunction with service-learning projects. Most recently, I have used it to create maps to illustrate my research.


I am currently working with census data and hope to learn new ways of visualizing information from a variety of sources:

* I am using census data to track intermarriage based on nativity, changes in language practices over time, and gender differences in those practices. At this point, my maps show the locations of towns and the growth of railroads, and they act as backdrops for pie charts.

* I would like to learn new ways to use GIS to visualize changes in language practices (who spoke French where and when) using census data, the distribution of French- and Spanish-language newspapers, photographs and/or the distribution of public signage, and the impact of school language policies (see the sketch following this list for one rough scripted approach).

* I would like to find new ways to visualize intermarriage practices, if possible.

* I am also intensely curious about possible ways to visualize migration and settlement patterns. On the international level, I would like to show changes in border-crossing traffic in response to stricter immigration policies and border enforcement. This could include points where border crossing stations or international bridges appeared, and hopefully more. At the city level, I would like to see how the ethnic makeup of town neighborhoods and rural areas may have changed. I’ve seen that later twentieth-century census data can be mapped at a detailed local level. I’d like to do the same with data from the 1860s to 1930s, and still hopefully be able to finish my dissertation before the turn of the next century.
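As a very rough sketch of the kind of town-level language map I have in mind, the following assumes a hypothetical CSV of census aggregates (town, year, lat, lon, total_pop, french_speakers) and uses the open-source geopandas and matplotlib libraries to draw one proportional-symbol map per census year. It is only an illustration of the idea, not a finished workflow:

```python
# Rough sketch: proportional-symbol maps of French-speaker share by town and year.
# Assumes a hypothetical CSV with columns: town, year, lat, lon, total_pop, french_speakers.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

df = pd.read_csv("census_towns.csv")  # hypothetical aggregated census data
df["french_share"] = df["french_speakers"] / df["total_pop"]

gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",
)

# One map per census year: marker size tracks town population,
# color tracks the proportion of French speakers.
for year, group in gdf.groupby("year"):
    ax = group.plot(
        column="french_share",
        cmap="viridis",
        legend=True,
        markersize=group["total_pop"] / 50,
    )
    ax.set_title(f"Share of French speakers by town, {year}")
    plt.savefig(f"french_share_{year}.png", dpi=150)
    plt.close()
```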


These are some of my initial ideas, and I am completely open to suggestions. I look forward to discussing your ideas and projects. Thank you.


Increasing Proprietary Database Literacy

Looking forward to meeting you all! The posts so far have been really exciting.

One of my ideas for a session is similar to Matt King’s post about procedural literacy and Jessica Murphy’s post about theorizing digital archives for graduate students. As I’ve just explained in a longer post on my own blog, many historians in my field, the history of the early republic, have begun to use proprietary databases like those published by ProQuest and Readex as crucial parts of their research process. The evidence of this is beginning to trickle down into the scholarship published in leading journals in our field; my longer post gives a few examples.

While I am personally interested in how methods like text mining and keyword searching might be deployed in my own research, I also think the increasing use of such methods will require all historians (and I would extend this to humanists generally) to stay up to speed on the differences between major proprietary databases. To evaluate, and also to write, the kinds of articles that are appearing now, I think we need an easier way to see, at a glance, what the default search conventions are in different databases (e.g., whether the text layers in these databases are created with OCR or by other means, how often databases are changed, how big the databases are, and so on). What I’m imagining is something like a SHERPA/RoMEO site that serves as an accessible and human-readable repository of information about proprietary databases used in humanities research.
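To make that concrete, each database in such a repository might be described by a small structured record along these lines. Every field name and value below is invented for illustration and is not drawn from any existing site or product:

```python
# Hypothetical record for one proprietary database in the imagined repository.
# All values are invented placeholders, not facts about any actual product.
record = {
    "name": "Example Historical Newspapers Collection",
    "publisher": "Example Publisher, Inc.",
    "coverage": "1800-1922",
    "text_layer": "uncorrected OCR",  # vs. hand-keyed or corrected text
    "default_search": "keyword, all fields, fuzzy matching off",
    "update_frequency": "quarterly additions of new titles",
    "approximate_size": "2 million pages",
    "citation_note": "page images carry stable URLs; search results do not",
    "last_verified": "2011-04-01",
}

for field, value in record.items():
    print(f"{field}: {value}")
```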

The questions I have related to this idea are: Do similar sites already exist? Would such a site be useful? What sort of information should it include to be useful? What features (search, sorting) would make the site most useful? What costs and problems would be involved in building such a site? Would it be best housed in an existing professional organization, or be cross-disciplinary? Should it be wiki-like, or maintained by a few authors? What funding would be required, and where might it be found? Could scripts or RSS feeds be used to keep the information up to date? What legal issues would be involved? Are there other, better means of keeping humanities scholars (even those, like myself, who are on the margins of or new to “digital humanities” proper) abreast of relevant information about proprietary databases?

Alternatively, could many of the same needs be met by developing a “manual of style” for humanists who wish to cite the results of keyword searches in proprietary databases? How rich should the information included in such citations be and how should it be formatted? Could we collectively draw up such a “style manual” for keyword searching at THATCamp?

My other idea for a session deals more with my teaching interests. I’m currently working with undergraduate students in my Civil War history class to build an Omeka site and would be interested in learning from others about their experiences with digital project management in a classroom setting.

Procedural Rhetorics, Procedural Literacy

Procedural literacy typically involves a critical attention to the computational processes at work in digital artifacts. Our understanding of a web page shifts if we consider it not only as a multimedia and hyperlinked text but also as a rendering of code that normally remains hidden from us. Ian Bogost argues that procedural literacy need not be limited to computational processes, that this mode of literacy encourages a more general capacity for mapping and reconfiguring systems of processes, logics, and rules. This expansive sense of procedural literacy resonates with James Paul Gee’s investment in “active learning,” an approach to education that emphasizes social practices rather than content as a static entity. Both procedural literacy and active learning highlight the importance of engaging texts (broadly defined) as embodiments of dynamic processes and configurations. Procedural rhetoric more specifically refers to the way that a text can be expressive and persuasive with reference to the procedures it embodies (Bogost privileges video games as examples of procedural rhetorics).

I would be interested in a session that considers the possibilities for teaching procedural literacy and procedural rhetorics as well as incorporating them into scholarly work. Areas of inquiry like critical code studies and video game studies offer one possible focus, but I imagine that the session could be more inclusive and expansive. For example, “digging into data” projects seem to require procedural literacy to establish algorithms through which to read texts. An algorithm functions as a sort of procedural argument: “this is a valid and helpful way to reconfigure these texts.” A recent article argued for reading David Simon’s The Wire as a sort of video game, a show deeply invested in attending to the logics and processes defining Baltimore’s drug trade and various institutional responses to it. In this sense, procedurality might be a useful concept for areas of inquiry that take us outside of the digital humanities proper.

My own interests have led me to focus on the intersection of rhetoric and video games (see the Digital Writing and Research Lab’s Rhetorical Peaks project), but I would be very interested to hear how others incorporate notions of procedurality, procedural literacy, and procedural rhetoric into their research and pedagogy.


Bringing DH to the LAM World

I would like to propose a session about how people are forging fruitful partnerships between DH (digital humanities) initiatives and the world of LAMs (libraries, archives, and museums).

In my own experience in the LAM world, I have witnessed many opportunities for symbiotic partnerships between the two go unexplored. At museums in particular, many important cultural heritage collections remain hidden due to lack of technological infrastructure, as well as fears about treading into new policy territory, exhausting resources, transgressing museum traditions, or ceding control of collections by making information available online.

Many museum collections are cultural heritage treasure troves and could become incredibly powerful scholarly resources if combined with DH tools and strategies like linked data and information visualization.  Additionally, museum professionals have great expertise to offer in the way of understanding and serving users, as well as organizing and presenting visual information. There exists a growing contingent of technology-friendly professionals within the greater museum community, but many of them work for larger, more generously funded institutions like the Smithsonian, or they are working on finite, grant-funded projects. At museum conferences, too many of the conversations focus on “making the case” for broader technology implementation to policy-makers, as opposed to actually implementing powerful digital collections solutions.

If LAMs were more routinely and directly engaged with the DH community, and if more dialogue focused on sharing resources and combining available and developing DH tools with long-standing LAM knowledge, expertise, and traditions, I sense that both communities of practice would benefit.

I would love to hear about other people’s experiences working at the intersection of DH and LAM practices, and to gain new insights into how to bring the two closer together.

Looking forward to meeting you all!

Identifying and Motivating Citizen X-ists

I’ve got several session ideas rattling around my head.  I doubt I could talk about any of them for more than 20 minutes, but if one of them fits well with another THATCamper’s interests, perhaps we can put a session together.

The last year or so has seen a lot of buzz about Citizen Scientists, Citizen Archivists, and many yet-unlabeled communities of people who volunteer their Serious Leisure time collaborating with institutions and each other to produce and enhance scholarship.  Institutions are becoming interested in engaging that public via their own on-line presences and harnessing public enthusiasm to perform costly tasks, spread the word about the institution, and enhance their understanding of their own collections.  Less well understood is the difficulty of finding those passionate volunteers and the nuances of keeping volunteers motivated.

I’ve been blogging about crowd-sourcing within my own niche (manuscript transcription) for a few years, and one of the subjects I’ve tracked is the varying assumptions about volunteer motivation built into different tools. Some applications (Digitalkoot) rely entirely on game-like features as incentives, while others (uScript, VeleHanden) enforce a rigid accounting scheme.  There is a real trade-off between these extrinsic motivations and the intrinsic forces that keep volunteers participating in projects like Wikisource or Van Papier Naar Digitaal, and project managers run the risk of de-motivating their volunteers.  Very few projects (OldWeather and USGS’s Bird Phenology Program among them) have balanced these well, but those have seen amazing results.

As a software developer my focus has been on the features of a web application, but finding volunteer communities to use the applications is equally important.  I’ve got a few ideas about what makes a successful on-line volunteer project but I’d love to hear from people from different backgrounds who have more experience in both on-line and real-world outreach.

Engaging the public.

Recently I attended the OAH conference in Houston. One of the sessions, “Texas Textbook Controversy” (which I live-tweeted: twitter.com/#!/search/txtxtbk), continually returned to the topic of engaging the public in what historians do.

For example, here are three of my tweets quoting @historianess:

.@historianess We need to engage the public in what we do, that the way we think about the past is constantly changing. #OAH2011 #TXTXTBK

.@historianess We don’t do a terribly good job of engaging the public. #OAH2011 #TXTXTBK

.@historianess We as a profession…need to be a lot more open about what we do. #OAH2011 #TXTXTBK

My idea for a session proposal would be to have an open dialogue about how we can use public-friendly digital technology (Twitter, Tumblr, etc.) to engage the public in what we do professionally. This could involve lots of different methods. Something that would coincide with the OAH session’s emphasis on interaction between higher education (historians specifically) and elementary and secondary teachers might involve integrating lesson plans (and educational standards) into a department’s current research projects, and vice versa. Several museums and websites do a great job of presenting information for teachers to use in creating lessons; however, there is very little interaction taking place, and therefore very little exchange of ideas or engagement with the public.

I admit that I only have a few ideas about implementing this, and even fewer specific goals that could be considered measurable objectives. However, I think this is a worthwhile discussion to have, and that I, and others, could learn from the exchange.

A final thought: considering the funding challenges facing many departments, I think we miss a great opportunity to gain public support for our profession (including an opportunity to encourage future scholars into our fields) by failing to engage the public. Considering how easy many of these online tools are to use, and considering that many of them are free, it seems a real waste for departments (and professionals) not to take advantage of them. While this may seem obvious to those of us who applied to THATCamp (we are likely already biased toward using digital means), perhaps we can gain further insight from one another about how to engage the public and which methods are most advantageous.
Thoughts?

Combining Text-Mining and Visualization

I’d like to propose a session on getting the most out of text-mining historical documents through visualizations. A lot of attention has been lavished recently (rightfully, for the most part) on Google’s n-gram tool and the recent Science article. Text mining more generally has been gaining traction among humanists, particularly as easily adopted new tools and programs become available.

I’m working on two big projects that try to extract meaningful patterns from large collections (newspapers in one, transcribed manuscripts in the other) and then make sense of those patterns through visualizations. Most of this takes the form of mapping (geography and time being the two most common threads in these sources), but we also use other kinds of graphs and visualizations (word clouds, for instance).

A major challenge, it seems to me, is that there is not a widely understood common vocabulary for how to visualize large-scale language patterns.  How, for example, do you visualize the most commonly used words in a particular historical newspaper as they spread out across both time and space simultaneously?

We’ve been experimenting with that in our projects, but I’d like to hash this issue out with folks working on similar (or not so similar!) problems.
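As one very simple starting point for the time dimension of that question, here is a minimal Python sketch (the corpus format is hypothetical) that counts how often a word appears per year in a set of dated documents and plots the trend. The hard part, which I hope we can discuss, is layering place onto the same picture:

```python
# Minimal sketch: relative frequency of one word per year across dated documents.
# Assumes a hypothetical list of (year, text) pairs already extracted from a corpus.
import re
from collections import Counter

import matplotlib.pyplot as plt

documents = [
    (1855, "The railroad reached the county seat amid great celebration..."),
    (1861, "War news crowded the railroad schedules off the front page..."),
    (1867, "New railroad surveys promised a line through the bottomlands..."),
]  # placeholder documents

term = "railroad"
counts = Counter()   # occurrences of the term per year
totals = Counter()   # total word count per year

for year, text in documents:
    words = re.findall(r"[a-z']+", text.lower())
    counts[year] += words.count(term)
    totals[year] += len(words)

years = sorted(counts)
rates = [counts[y] / totals[y] for y in years]  # relative frequency per year

plt.plot(years, rates, marker="o")
plt.xlabel("Year")
plt.ylabel(f"Relative frequency of '{term}'")
plt.title("Word frequency over time (toy example)")
plt.savefig("term_frequency.png", dpi=150)
```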
