Running at the speed of Library Science ... with Liam: 2012

The URL of this coursework blog is http://library-liam.blogspot.com/2012/01/dita-coursework-blog-2-web-2030-and.html

Title: Library 2.0 colliding with the semantic web: an iceberg effect?

1. Introduction

The integration of web 2.0 technologies into online library portal interfaces is changing how library users access and interact with information sources. There is a real danger, however, that user input will create ‘information icebergs’: the volume of information expands rapidly while a small amount of ‘searchable’ user-centric ontological metadata builds up and ‘freezes’ above the useful primary data, pushing it below the surface where it lurks unseen and inaccessible without deep and effective search retrieval. The library’s raison d’être of making information transparent and discoverable to all users is seemingly lost in the process. The iceberg effect the writer speaks of here is akin to the concept of web 3.0 (the semantic web), which threatens to sink the traditional notion of librarianship.

A comparative analysis of how Web 2.0 technologies have been incorporated into the user interfaces of academic, private sector and public library services, and of how semantic web strategies could apply to electronic library services in the future, will allow the writer to consider whether such technologies are truly compatible, or ever will be.

2. Library 2.0?

Library 2.0[1] appears to be the next logical evolution of library services in our increasingly digitised online society, and seemingly follows the same trajectory as the platform (the World Wide Web) that facilitated and now realises the concept. If Web 1.0 first allowed machine-readable data to reach the first online generation as digitally presented information, and Web 2.0 permitted the same, albeit maturing, users to read and write their own data in order to add to, edit or remove online information, then it is predicted that the semantic web (Web 3.0) will allow future generations of internet users to read, write and execute information by providing user-created ontological metadata that bolsters the artificial intelligence of computer servers in meeting our fine-tuned information needs.

Library 2.0 “simply means making your library’s space (virtual and physical) more interactive, collaborative, and driven by community needs.”[2] Libraries allow access to information collected through their materials, traditionally through read-only catalogues, and now provide online portals that add value to the same information by encouraging user participation in, and feedback on, those resources through integrated web 2.0 technologies. Brophy (2007) argues that the ‘long tail’[3] effect, that of more and more users joining up to a service provided over the same network thus adding value for each user, is how the idea of web 2.0 marries with digital libraries in the creation of ‘Library 2.0’.

3. Information needs

One function of web 2.0 technologies is to allow for the personalisation or customisation of information, and this is dictated by the information needs of the user - its appropriateness will depend on the nature of the library itself.[4]

Catalogues and databases are the entry portals for library users who have specific information needs. Web 2.0 can help satisfy those needs by directing users to search catalogues in different ways. Computers can analyse the frequency of search terms and relate them to items accessed on the catalogue. Every item is catalogued under a specific subject and relational links are then made by the software. Tag clouds, as used on the City University library catalogue[5], show the popularity of interrelated search terms or subjects after you make an initial search: the larger words represent the most popular, and therefore seemingly the most relevant, related searches. This collaborative use of web 2.0 technologies and user input creates a knowledge base that is realised and actioned through social web interaction, by writing those needs into the software as metadata, and seemingly is the way forward. Truly speaking, “knowledge – its content and organization – is becoming a social act”.[6]
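The way a tag cloud turns search-term frequency into word size can be sketched in a few lines of Python. This is only an illustration and not how any particular library catalogue is implemented: the function name `tag_cloud_sizes`, the point sizes and the sample search log are all assumptions made for the example.

```python
from collections import Counter


def tag_cloud_sizes(search_terms, min_pt=10, max_pt=32):
    """Map each search term to a font size scaled linearly by its frequency."""
    # Normalise case and whitespace so "Metadata" and "metadata" count together.
    counts = Counter(t.strip().lower() for t in search_terms if t.strip())
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for term, n in counts.items():
        # If every term is equally popular, give them all the smallest size.
        weight = (n - lo) / span if span else 0.0
        sizes[term] = round(min_pt + weight * (max_pt - min_pt))
    return sizes


# Hypothetical search log, invented for this example.
log = ["metadata", "Metadata", "ontology", "metadata", "cataloguing", "ontology"]
sizes = tag_cloud_sizes(log)
for term, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{term}: {size}pt")
```

The most frequently searched term is drawn largest, which is why popularity can be mistaken for relevance when reading a tag cloud.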

4. Digital representation and organisation

Libraries have traditionally maintained a role in selecting, indexing and making available information resources to their users: the form in which those resources are now represented and organised (digital data) has evolved with technological innovations, but is limited by new constraints.

Modern libraries, hosting OPACs (online public access catalogues), now adhere to MARC formats when cataloguing, which are “standards for the representation and communication of bibliographic and related information in machine-readable form.”[7] MARC signified the introduction of library catalogue records in machine-readable form, and such records can also be expressed as metadata through an XML schema particular to that bibliographic information. This in turn allows for implementation and integration of the data through web-hosted services such as application programming interfaces (APIs) and mash-ups[8], and consequently leads into user-manipulation activities such as personalisation and customisation.
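To make the idea of a machine-readable catalogue record concrete, the sketch below reads a small record written in MARCXML, the XML serialisation of MARC 21, using only Python's standard library. The record is a hand-written illustration built from an entry in this post's bibliography, not a record taken from any real catalogue, and the helper name `subfields` is an assumption. Field 100 holds the main author entry and field 245 the title statement.

```python
import xml.etree.ElementTree as ET

# Namespace used by the MARCXML ("MARC 21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-written illustrative record (not copied from a real catalogue).
RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Weinberger, D.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Everything is miscellaneous :</subfield>
    <subfield code="b">the power of the new digital disorder</subfield>
  </datafield>
</record>"""


def subfields(record, tag):
    """Return the text of every subfield in datafields carrying the given tag."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield"
    return [sf.text.strip() for sf in record.findall(path, MARC_NS)]


record = ET.fromstring(RECORD)
author = " ".join(subfields(record, "100"))
title = " ".join(subfields(record, "245"))
print(author)
print(title)
```

Because every field is addressed by a numeric tag and a subfield code, a web service or mash-up can pull out exactly the elements it needs without parsing free text.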

An interface indicates that its designer has not understood the relevant user information needs if it distracts or deviates from its purpose (accessing information) by making navigation cumbersome or impracticable, or if it fails to limit the amount of information[9] on the screen by editing the content (which can be remedied by using markers such as hyperlinks or concertinaed drop-down menus).

The mindset of short-of-time library ‘browsers’ also manifests itself in the online library environment. Functional design of library service interfaces enables the optimal representation of information in digital form, although that information must be clearly signposted and transparent in order to satisfy user needs efficiently.

5. Real life ‘Library 2.0’ manifestations

The spread of ‘Library 2.0’ can be seen across all library sectors, presenting users with radical options for accessing information and services virtually.

In public libraries, there has been a slowly expanding realisation of web 2.0 technologies outside of the conventional online library OPACs. Pinfield et al. (1998) have identified the current model to be that of the “hybrid library”: a combination of a digital library within the operation of a traditional physical library, illustrated by community library services[10] now providing an e-library in which selected digital material can be loaned over the internet.[11]

In private sector libraries, such as law libraries, Web 2.0 has been used as a tool for marketing and promoting library services to users.[12] Law is a subject which is constantly evolving and information sources very quickly become out-of-date. Inner Temple Library hosts a Current Awareness blog[13] which provides “up-to-date information on new case law, changes in legislation and legal news” in the form of hyperlinks to online information resources. This information is also pushed through to users signed up to the Twitter feed[14] or Facebook page[15], and the blog is further supplemented by features such as buttons to subscribe to RSS feeds, mailing lists and hyperlinks to the organisation’s online presence on other websites. Podcasts[16], instant text messaging and online chat enquiries[17] are other examples of how web 2.0 is being integrated into libraries. However, Brophy (2007) notes that there is still a long way to go and in particular that “most library applications do not yet support conversational use and data sharing, including the support of individual and group activity” (at p.143).

6. Semantic library portals

The implementation of web 3.0 as semantic technologies for use in libraries has not yet been fully realised, but arguably this is not far off.

Chowdhury and Chowdhury (2003) support the use of semantic web technologies in the development of digital libraries, and argue that the seamless access to digital information resources “calls for a mechanism that permits analysis of the meaning of resources, which will facilitate computer processing of information for improved access” (at p.220).

Such a mechanism is arguably akin to a semantic information portal, as advocated by Reynolds, Shabajee and Cayzer (2004), which serves as “... a community information portal that exploits the semantic web standards to improve structure, extensibility, customization and sustainability” (at p.290).
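Semantic web standards describe resources as subject-predicate-object statements that software can query. The Python sketch below imitates that idea with plain tuples; it is a simplified assumption for illustration, not the portal design of Reynolds, Shabajee and Cayzer. The identifiers such as `blog:current-awareness` are hypothetical, and `dc:publisher` and `dc:subject` merely borrow Dublin Core element names as labels.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical statements about resources mentioned in this post.
TRIPLES = [
    Triple("blog:current-awareness", "dc:publisher", "Inner Temple Library"),
    Triple("blog:current-awareness", "dc:subject", "case law"),
    Triple("blog:current-awareness", "dc:subject", "legislation"),
    Triple("podcast:bl", "dc:publisher", "British Library"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# What is the Current Awareness blog about?
subjects = [t.obj for t in match(TRIPLES, "blog:current-awareness", "dc:subject")]
# Which resources does the British Library publish?
published = [t.subject for t in match(TRIPLES, predicate="dc:publisher", obj="British Library")]
print(subjects)
print(published)
```

The point of the structure is that the meaning of each link is explicit, so a query can ask about publishers or subjects directly rather than relying on keyword matches.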

The Inner Temple Library website, in providing links to a number of databases, blogs and other websites on one page, appears to be the base upon which to build this approach. However, elements of web 2.0 interactivity, in which comments can be added and attached links shared through social media accounts, are currently missing. Their usefulness and purpose in that context are still questionable, as mixing work with pleasure on social networks remains a contentious area for most.

7. Below the surface – hidden depths and dangers

Digital information representation and provision need to be transparent, yet searching for, retrieving and accessing information over a dynamic platform such as the World Wide Web[18] is a murky prospect to accomplish successfully.

Baeza-Yates and Ribeiro-Neto (2011) identify the main challenges the internet poses for information searching. These include an unbalanced distribution of data across limited bandwidth and unreliable network connections; inconsistent access due to changing file destinations and broken links; and a semantic redundancy of semi-structured data, where the same HTML pages are constantly replicated through searching. Added to these are the problems inherent in self-published information of questionable integrity, which appears in different formats and languages.

All the above factors arguably affect the recall (if the amount of searchable information increases along with the number of data formats) and precision (if the search results present data which is not of high integrity) of information retrieval results.
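In information retrieval, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The short Python sketch below computes both for a single search; the function name `precision_recall` and the document identifiers are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, treating both inputs as sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that the search actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four results returned, three relevant items exist in total.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc5"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Adding more low-integrity results to the retrieved list lowers precision, while relevant material that is never indexed or cannot be reached lowers recall.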

Baker (2008) identifies the hurdles to clear as the “need to develop shared and common standards, interoperability, and open access using the same concepts and terminologies and on an international scale” (at p.8).

Libraries in particular, as trusted information providers in their community, should seek the correct balance between maintaining their editorial control[19] to preserve data integrity and listening to their user base when developing search strategies, using metadata to enrich their catalogue records so that users’ information needs are satisfied through reduced search time and highly relevant results.

8. Conclusion

To return to the ‘information iceberg’ analogy from the introduction, the problems in providing discoverable information appear firmly connected to the widening of the ‘semantic gap’. The gap increases as more and more information is uploaded online without first being catalogued or indexed, creating an insurmountable iceberg of older unclassified information that sinks below the surface as more and more user-generated data (and metadata) is added online. The authenticity and reliability of such data immediately come into doubt, as it lacks authority.

In addition, search engines will need to deal with all the content afforded by web 2.0 technologies through HTML pages which are dynamically generated and therefore inherently complex.[20] New technological advances therefore create new problems to overcome.

Van Dijk (1999) perhaps gives a stark warning as to the future of digital information caught up in Web 3.0, in foreseeing the risk and consequences of over-reliance on information agents as the semantic solution:

“Systems get smarter, but their users might become stupider” (at p.185).

Whilst computers can adapt to user preferences, they cannot react to changing human values and emotions and cannot be completely pre-programmed. Over-reliance on information devices can isolate users from the real world, so that they miss out on interactions and opportunities one can only obtain from human contact. Traditional libraries and their human agents, in providing and supporting the resources necessary for user information searches face-to-face within defined real-world environments, avoid these problems.

The information iceberg cannot be left to expand without human observation or we risk losing control, order and sight of the value of digital information.

References

[1] The term ‘Library 2.0’ was first introduced by Michael Casey in his blog entry titled ‘Librarians without Borders’ dated 26th September 2005. See: http://www.librarycrunch.com/2005/09/

[2] As defined by Sarah Houghton-John in her blog entry titled ‘Library 2.0: Michael Squared’ dated December 2005. See: http://librarianinblack.net/librarianinblack/2005/12/library_20_disc.html

[3] Based on ideas propounded by Chris Anderson in his blog entry titled ‘The Long Tail’ dated October 2004. See: http://web.archive.org/web/20041127085645/www.wired.com/wired/archive/12.10/tail.html

[4] Coles (1998) identified that the information needs of users between public and private libraries are not the same.

[5] See ‘refine by tag’ window appearing on the right hand side of the screen after a user search is made.

[6] Weinberger (2007) at p. 133

[7] http://www.loc.gov/marc/

[8] A good example of a mobile mash-up application is Library Anywhere, used by City University Library, which provides access to library catalogues and user accounts via a smart phone screen: see http://libguides.city.ac.uk/content.php?pid=234596&sid=2157583

[9] Chu (2010) states the essential problem in information representation and retrieval to be “how to obtain the right information at the right time despite the existence of other variables in the [...] environment” (at p.18).

[10] The writer’s local example: Hertfordshire County Council – online library services: http://www.hertsdirect.org/services/libraries/online/

[11] Herts e-library service: http://herts.lib.overdrive.com/8F191FBA-0D95-4AA6-915E-691A653360D5/10/491/en/Default.htm

[12] Harvey (2003) at p. 37 notes that current awareness blogs can be used to remind users of [information] services they may have been previously unaware of, and also allows for innovation on the part of the blog writer in developing and improving those services.

[13] http://www.innertemplelibrary.com

[14] https://twitter.com/inner_temple - Twitter username: @Inner_Temple

[15] http://www.facebook.com/innertemplelibrary

[16] Such as that provided by the British Library, see: http://www.bl.uk/podcast

[17] For example, 24/7 live chat communication with a librarian is provided through the government supported People’s Network website: http://www.peoplesnetwork.gov.uk/

[18] Baeza-Yates and Ribeiro-Neto (2011) note this to be “chaotic and unstructured, providing information that may be of questionable accuracy, reliability, completeness or currency” (at p.685).

[19] See Weinberger (2007)

[20] Baeza-Yates and Ribeiro-Neto (2011) at p. 450.

Bibliography

Anderson, C. (2004) The Long Tail, Wired (blog), [online] available at: http://web.archive.org/web/20041127085645/www.wired.com/wired/archive/12.10/tail.html [accessed 20th December 2011]

Baeza-Yates, R., and Ribeiro-Neto, B. (2011) Modern information retrieval: the concepts and technology behind search. 2nd ed. London: Pearson Education.

Baker, D. (2008) From needles and haystacks to elephants and fleas: strategic information management in the information age, New Review of Academic Librarianship, 14: 1–16 [online] via LISTA, accessed 31st October 2011.

British Library website, Podcasts, [online] available at: http://www.bl.uk/podcast [accessed 20th December 2011]

Casey, M. (2005) Librarians Without Borders, [online] available at: http://www.librarycrunch.com/2005/09/ [accessed 20th December 2011]

Casey, M. and Savastinuk, L.C. (2006) Library 2.0: service for the next generation library, Library Journal, [online] available at: http://www.libraryjournal.com/article/CA6365200.html [accessed 20th December 2011]

Chowdhury, G.G. and Chowdhury, S. (2003) Organizing information - from the shelf to the web. London: Facet Publishing.

Chu, H. (2010) Information representation and retrieval in the digital age. 2nd ed. Medford, New Jersey: Information Today, Inc.

City University Library website, [online] available at: http://www.city.ac.uk/library/ [accessed 16th December 2011]

City University Library – LibGuides – Mobile Devices webpage, [online] available at: http://libguides.city.ac.uk/content.php?pid=234596&sid=2157583 [accessed 31st December 2011]

Coles, C. (1998) Information seeking behaviour of public library users: use and non-use of electronic media. In: Wilson, T.D. and Allen, D.A., ed. 1999. Exploring the contexts of information behaviour. London: Taylor Graham Publishing, 321-329

Current Awareness from the Inner Temple Library website, [online] available at: http://www.innertemplelibrary.com/ [accessed 7th December 2011]

Harvey, T. (2003) The role of the legal information officer. Oxford: Chandos Publishing.

Hertfordshire County Council – Libraries website, [online] available at: http://www.hertsdirect.org/services/libraries/online/ [accessed 31st December 2011]

Herts e-library service website, [online] available at: http://herts.lib.overdrive.com/8F191FBA-0D95-4AA6-915E-691A653360D5/10/491/en/Default.htm [accessed 31st December 2011]

Houghton-John, S. (2005) Library 2.0 Discussion: Michael Squared, Librarianinblack (blog), [online] available at: http://librarianinblack.net/librarianinblack/2005/12/library_20_disc.html [accessed 31st December 2011]

Inner Temple Library Facebook page, [online] available at: http://www.facebook.com/innertemplelibrary [accessed 31st December 2011]

Inner Temple Library Twitter page, [online] available at: https://twitter.com/inner_temple [accessed 31st December 2011]

MARC STANDARDS: Library of Congress – Network Development and MARC Standards Office website, [online] available at: http://www.loc.gov/marc/ [accessed 31st December 2011]

People’s Network – online services from public libraries website, [online] available at: http://www.peoplesnetwork.gov.uk/ [accessed 31st December 2011]

Pinfield, S., Eaton, J., Edwards, C., Russell, R., Wissenberg, A., and Wynne, P. (1998) Realizing the hybrid library, D-Lib Information Magazine, October 1998, [online] available at: http://www.dlib.org/dlib/october98/10pinfield.html [accessed 19th December 2011]

Reynolds, D., Shabajee, P. and Cayzer, S. (2004) Semantic information portals, [online] available at: http://www2004.org/proceedings/docs/2p290.pdf [accessed 7th December 2011]

Van Dijk, J. (1999) The network society: social aspects of media [translated by Leontine Spoorenberg]. London: Sage Publications.

Weinberger, D. (2007) Everything is miscellaneous: the power of the new digital disorder. New York: Times Books/Henry Holt and Company.


Wednesday, 4 January 2012

DITA coursework blog 2 - Web 2.0/3.0 and beyond