DCW Volume 1 Issue 11 – D3, New SEO, and “My Archive”

by Korey Jackson on August 10, 2012

Writing in 3D with D3: Data Driven Documents
Matt Burton

I spent an afternoon the other day finally digging into a very interesting JavaScript library (library in the programming sense, as in a collection of code available for reuse) for creating data visualizations. The library is called D3.js, or just D3, which stands for Data Driven Documents. Perhaps it is the caffeine talking, but I think D3 is going to transform the way I produce scholarship.

From a technical perspective, D3 is a JavaScript library for transforming data into webpages. D3 takes very seriously the affordances of the web, and more specifically the browser, and even more specifically the DOM or Document Object Model (the formal representation of web pages inside a web browser), as a medium for expression. D3 provides a set of functions for ingesting data from a URL and then manipulating the DOM to produce various representations of the data (charts, graphs, lists, interactive maps, etc.). One analogy might be to consider the browser window as a sheet of paper. On paper you might use a pen or pencil to produce symbolic representations of ideas (words, drawings, doodles, etc.). When working with paper, there are a multitude of tools to make certain kinds of expression easier. You can, theoretically, draw a straight line with just a pencil, but it becomes a lot easier with the aid of a ruler. D3 is a toolbox filled with the digital equivalent of rulers and pens and pencils and compasses and a profusion of other tools. However, D3 does assume a lot of knowledge about HTML, CSS styling, JavaScript programming, and data cleaning & processing; it is a power user's tool.
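To make the "data in, document out" idea a little more concrete, here is a minimal sketch in TypeScript. It deliberately does not use D3's own API; it only imitates the general pattern of mapping each data point to a piece of markup, with a ruler-like scale function doing the measuring. The names `Datum`, `linearScale`, and `renderBarChart`, and the sample data, are all hypothetical and chosen for illustration.

```typescript
// Hypothetical shape for one data point (not taken from D3 or the article).
interface Datum {
  label: string;
  value: number;
}

// A "ruler": maps a value in [0, domainMax] onto [0, rangeMax] pixels.
function linearScale(domainMax: number, rangeMax: number): (v: number) => number {
  return (v: number) => (domainMax === 0 ? 0 : (v / domainMax) * rangeMax);
}

// Turn each datum into one <rect> and one <text>: the data drive the document.
// Labels are inserted as-is, so this sketch assumes they contain no markup.
function renderBarChart(data: Datum[], width = 200, barHeight = 20): string {
  const max = Math.max(0, ...data.map((d) => d.value));
  const x = linearScale(max, width);
  const rows = data.map((d, i) => {
    const y = i * barHeight;
    return (
      `<rect x="0" y="${y}" width="${x(d.value)}" height="${barHeight - 2}"/>` +
      `<text x="${x(d.value) + 4}" y="${y + barHeight / 2}">${d.label}</text>`
    );
  });
  return `<svg width="${width + 60}" height="${data.length * barHeight}">${rows.join("")}</svg>`;
}

// Made-up sample data for illustration only.
const svg = renderBarChart([
  { label: "letters", value: 40 },
  { label: "ballads", value: 10 },
]);
console.log(svg);
```

Running this prints a single SVG string with one bar per data point, each bar's width proportional to its value. D3 works on the live DOM rather than on strings and offers far richer tools, but the underlying move, binding data to elements and letting the data set their attributes, is roughly the same.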

D3 is interesting not only because it is a tool for creating data driven documents, but also because it is a kind of scholarship itself. The tool was created by Mike Bostock while he was working on his PhD in computer science at Stanford. D3 is a multi-modal scholarly product, albeit one that required formal accounting in the form of a conference paper. The paper, however, is not the real scholarly contribution; the JavaScript library is what is really valuable. This style of scholarship, the tool + conference paper, is very common in the visualization and Human-Computer Interaction worlds, and could be a model for some kinds of scholarship in the digital humanities (but I must emphasize only SOME forms of scholarship in DH).

Assuming you have the requisite technical skills, I recommend Getting Started with D3, but this book will only whet your appetite. Next, read through Scott Murray’s excellent D3 tutorials (Scott stopped writing D3 tutorials and is now writing a book on data visualization & D3). Finally, visit the D3 website, which is chock full of beautiful examples and well-written documentation.

D3 is very cool and has a growing community of industrial data scientists (Bostock is now working for Square), data journalists, hard scientists, and even a couple of digital humanists here and here. I am especially excited to see what kinds of scholarship digital humanists produce as they begin thinking in terms of “data” (for good and for bad) driven digital scholarship.


The Return of Content?
Korey Jackson

Yaron Galai of Outbrain and AdAge.com offers this optimistic report about Google’s Penguin algorithm and the rise of “New SEO.” If the old version of SEO was about baiting crawlers and link-selling, says Galai, Google’s new indexing methods are much more aligned with what “white hat” web publishers have always been about: producing good content.

In the recent past, the sites that often ended up on top of the Google rankings pile were those that engaged in “black hat” SEO strategies: larding up content with fraudulent links, anchor tags, etc., and engaging in shady link exchange practices (or buying site hits en masse)…and simply dumbing down content to ensure keyword optimization.

With this latest revision to Google’s algorithm, the hope (at least for legitimate publishers) is that we’ll see a return to rankings based on actual sharing, actual content, and actual quality. Of course, any algorithm can be gamed. The thing is to make gaming at least as time-consuming and expensive as good content production.

The question for DCW readers, then, is: how can scholarly web publishers and academic bloggers ensure that their own content is being indexed well? “White hat” doesn’t mean “total lack of SEO strategy.” And there’s a tendency in the scholarly community to see any strategy that smacks of commerce as a bad thing, or at least as something not in keeping with the ideal of pure knowledge dissemination (i.e. “I’m in art history, not used car sales”). So what are you doing to bring readers to your site (aside from writing well and often)? Do you think about SEO? Or is it even the job of the digital scholar to take algorithmic strategies into consideration?


What’s Your Archive?
Edward Whitley

Archival research is hot. A lot hotter than it was when I was an undergraduate English major in the early 1990s and Big Theory had not yet worn out its welcome. Back then, the scholars who attracted the most attention were those who had mastered the occult knowledge of postmodern theory and could apply it to literary texts with a performative flair that left the rest of us wondering if we’d ever be able to imitate their astounding feats. Today, the scholars who fill the conference rooms at academic conventions are the ones who have found obscure documents in musty archives and can skillfully weave these little-known texts into a larger narrative about literature and culture. As archival research has grown in prestige and authority, I’ve heard scholars use phrases like “my archive” to describe the body of primary sources they’re working with; I’ve also heard them ask one another, “What’s your archive?” as they discuss their work in progress.

I don’t think my colleagues are referring to specific brick-and-mortar structures like the American Antiquarian Society or the New York Public Library when they talk about “my archive.” What I think they’re trying to capture is a sense of the parameters that they apply as they search through a variety of archival holdings, both digital and physical: “my archive” is broadsheet ballads from the antebellum period; “my archive” is newspaper poetry; “my archive” is the testimonies of former slaves. Personally, I love the construction “my archive.” Maybe it’s because I’m a sucker for good archival research. I love the digital photos of rare documents glowing in pre-publication glory that scholars share at conferences. I love the moment in a scholarly essay when generations’ worth of common knowledge about a text or an author is thrown out the window by a single rare document, smouldering there on the page like the smoking gun that it is.

As archival research has replaced Big Theory as a source of prestige and authority in the academy, archives have come to play the gatekeeping role that theory once played as well. Finding something juicy in an archive has replaced, in many ways, the performative application of postmodern theory to a single literary text. What role do digital archives play in this transition from theory to primary research? Do digital archives of otherwise hard-to-find primary sources democratize the prestige and authority conferred by archival research? Or has this gatekeeping function reasserted itself in other ways? Now that more and more material is available online, is there a certain cachet to finding something available only in a brick-and-mortar archive? Is greater prestige assigned to winning a fellowship at places like the American Antiquarian Society? Can for-profit vendors of password-protected digital archives command a higher price from university libraries for access to their valuable resources?

There are many questions we should ask ourselves as we reflect on the transition from the age of Big Theory to the age of “my archive,” and they are questions that should guide our use, creation, and dissemination of digital archival resources.
