Automata in the library

More musings on the role of digital technology in the library, in response to my course at #citylis.

The following is my third and final reflection on my #citylis course on ‘Digital Information Technologies and Architectures’.

The world has turned and we find ourselves coming to the end of the first semester (already!) here at #citylis, and therefore also to the end of our course on digital technology and its impact on library and information science (LIS).

Since I posted my last reflection on this topic, the course material has gone on to cover the following areas: altmetrics, a branch of bibliometric analysis which measures the impact of research in non-traditional arenas, including on social media; coding, digital text analysis and text mining, with a nod towards the related field of the digital humanities; and, finally, the rather wide spectrum of technological developments which come under the heading of artificial intelligence (AI).

When thinking recently about a way to sum up the last three weeks of the course, I was reminded of the model of the information communication chain, a concept initially proposed by Lyn Robinson in a paper from 2009 as being representative of the fundamental area of interest for researchers in LIS.

Information Communication Chain – © @lynrobinson

Above all, I started to think about the ways in which these technologies might inform our understanding of the various stages of the dissemination, sharing and management of information, and also in some instances its organisation and retrieval as well. It also came home to me just how similar they are in some ways, at least in terms of the implications they have for human agency and understanding. 

Altmetrics is a good place to start, mainly because it is a clear example of digital technology being used to aid what in some ways was (and still is) the task of LIS professionals, above all subject specialist librarians. Evaluating the significance of a particular document – a journal article, blog, dataset, etc. – in this case by measuring the amount of attention it is receiving on a number of communication channels (or so-called “impact flavours” such as blog posts, mainstream news, Wikipedia articles, policy documents, discussions on social media, and so on), allows for a new perspective on the spread of information, one which promises an alternative, more nuanced picture of scholarly communication and (re)use to that of more traditional metrics focused on citations.

Attention does not necessarily correlate with quality, of course; the figures by themselves have the potential to mislead, above all when it comes to analysing the distinct patterns of reception and discussion at work in the life-cycle of a particular research output. Nevertheless, the increasing volume of scholarly literature in circulation, along with its growing complexity and diversity (as made clear in a 2014 report by OCLC Research on The Evolving Scholarly Record), means that some indication of currently trending topics and contributions to ongoing discussions can but be of help both to the acquisitions librarians tasked with collections development in a certain area and to those given custody of their institution’s digital repository.

What is more, the integration of altmetrics tools (such as Altmetric and Plum Analytics, to name but two) with the APIs of web services like Twitter and Mendeley means that the task of gathering citation counts and sifting through the references to an article on social media can to a great extent be automated.

In a similar way, the applications of coding and AI within LIS allow both for the automation of a number of routine tasks and for otherwise overlooked insights into aspects of certain datasets (such as those recently made available by the British Library from its collections).

The two are also linked in uncanny ways. Python, one of the primary programming languages used in coding for the digital humanities, is also at the heart of several projects currently being undertaken in the field of machine learning. What is more, the techniques used in the digital humanities to perform text and data mining are likewise important parts of the related process of natural language processing (i.e. the transference of semantic, linguistically-based information contained in human speech into digital information via a series of statistical and probabilistic models). In both cases, the presence of large corpora of well-formed, ‘clean’ textual data is required.
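To give a flavour of what the simplest kind of text mining looks like in Python, here is a minimal sketch of a word-frequency count over a tiny corpus. It is only an illustration under stated assumptions: the function names (tokenize, top_terms), the two-sentence corpus and the stopword list are all invented for this example, and real digital humanities projects would normally work with far larger corpora and dedicated libraries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, n=3, stopwords=frozenset()):
    """Count tokens across all documents and return the n most frequent.

    Both function names are hypothetical, chosen for this illustration.
    """
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in stopwords)
    return counts.most_common(n)


# A toy, already 'clean' corpus invented for this example.
corpus = [
    "The library holds the record of the past.",
    "A library is a record and a record is a memory.",
]

# A deliberately tiny stopword list (also an assumption of this sketch).
STOPWORDS = frozenset({"the", "a", "of", "is", "and"})

print(top_terms(corpus, n=2, stopwords=STOPWORDS))
```

Even at this scale the dependence on 'clean' text is visible: stray markup, OCR errors or inconsistent spelling would all be counted as separate words and skew the results.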

I admit, it is sometimes difficult not to focus on the disruptive elements of such technologies. The potential use of neural networks in the automatic classification and indexing of documents, to take just one example, could have an enormous effect on LIS theory and practice when it comes to information organisation. What would prevent an AI cataloguer from deciding to classify a document in a completely different way to a human, based upon different logical or statistical criteria? Would the machine even require an information organisation system that corresponded to anything a human could feasibly understand? How would this change the process of information retrieval, for instance?

And what of other tasks in the library? To name just one recent example, the advent of an AI digital legal assistant, ROSS, currently being developed in Toronto and built on the model of IBM’s Watson, could have significant impacts on the job of legal reference librarians.

On the other hand, it would be unwise for LIS to stick its proverbial head in the sand when it comes to these technologies. As they become more and more a part of the information communication landscape and a feature of everyday reality in the way in which scholarly research and other knowledge are conducted and disseminated, libraries and other stakeholders with an interest in the documented record of humankind will naturally need to find a way of incorporating these technologies into the services they provide to their users. At the same time, however, they should remain alert to the ethical implications that these and other technologies may present.

Thanks to Lyn and David, at least our cohort at #citylis should be suitably prepared for that potentiality!