Thursday, 9 December 2010

RSP e-theses briefing papers

The RSP have made a number of reports and briefing papers on e-theses, authored by UCL, available on their website (they can be found near the end of the page). The papers cover:

Influencing the Deposit of Electronic Theses in UK HE. Report on a sector-wide survey into thesis deposit and open access
Influencing the Deposit of Electronic Theses in UK HE, Appendix. Full text responses from a sector-wide survey into thesis deposit and open access
Vision, Impact, Success: mandating electronic theses. Case studies of e-theses mandates in practice in the UK Higher Education sector.
Third party copyright
Impact on future publication
Managing embargos
Policies and guidelines available on the internet
Sensitive content
Workflow analysis

Tuesday, 30 November 2010

E-thesis & Dissertation Bibliography

Digital Scholarship have released version 5 of their E-thesis and Dissertation Bibliography.

'This selective bibliography includes articles, books, conference papers, technical reports, unpublished e-prints and other scholarly textual resources that are useful in understanding e-theses and dissertations.'

Digital Scholarship have also collated bibliographies relating to:

Institutional Repositories
Digital Curation and Preservation
Open Access Journals
Scholarly Electronic Publishing

Tuesday, 2 November 2010

OA Week Competition winner: Misha Jepson, Glyndŵr University

We are pleased to announce Misha from Glyndŵr University as our OA Competition winner!!

Misha’s engagement story described the use of advocacy to both gain the attention of an institution’s senior management team and to effectively put across the benefits of a repository to an institution. The story showed the importance of the ‘Elevator Pitch’ advocacy technique in grabbing opportunities where you can to get your case heard. It also showed how aligning the use of the repository with the institution’s strategic aims can embed the system within an institution’s structure.

A copy of Misha's winning story is available here.

RAE data available for download: JISC MERIT project

A new resource is available which may be of interest to those who are looking to use their institution’s RAE data to populate their repository and/or other publication management systems.

The JISC MERIT project has just launched its RAE submissions database, which contains data on every UK institution’s RAE submission.

The database offers faceted searching on the citation data held by: institution; Unit of Assessment; output type; author. The results of any combination of these searches can then be exported and saved to Excel files.

From what I can see, the database only holds citation data. It does not seem to offer full text, links out or DOI look-ups.

Monday, 13 September 2010

Open Access Week Competition

The week of Monday 18th - Sunday 24th October has been allocated as Open Access Week across the globe. Now in its 4th year, this dedicated week aims to promote 'Open Access as a new norm in scholarship and research'.

To mark the occasion we at the WRN are running a competition for partners with an Open Access theme. We are looking for your best repository success story, whether it's a story of success in convincing an academic researcher to interact with the repository, or a tale of a deposited item that ended up proving the wide-reaching audience of the repository.

Entries can be as long or as short as you want and we are looking to put the best stories together in a blog post and perhaps even in a new advocacy learning object!

The competition is open from now until the Friday before Open Access Week (15th October). There will be a prize available for the winning entry.

Good luck!

Learn about other Open Access Week events, contests and resources through

Friday, 20 August 2010

Repositories and CRIS article

An article has been published in the latest issue of Ariadne about the Repositories and CRIS event we ran in Leeds in May this year. ‘Learning how to play nicely: Repositories and CRIS’ is available from

The full contents of the journal issue, including articles on e-books, Library 2.0 and data management which may also be of interest, are available from

Wednesday, 11 August 2010

Annual growth figures now available

Just a quick post to let everyone know that I have now collated our latest batch of statistical data, which means we now have growth figures covering a full 12-month period. Overall, we have seen a very healthy 43.73% growth in the number of items within our repositories over the past year - well done all!
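
For anyone wanting to produce a comparable figure for their own repository, the calculation is simply the change in item count divided by the starting count. The sketch below assumes you have a total item count for the start and end of the period; the two totals used here are made-up placeholders chosen for illustration, not our actual figures.

```python
def percentage_growth(start_count: int, end_count: int) -> float:
    """Return the growth from start_count to end_count as a percentage."""
    if start_count <= 0:
        raise ValueError("start_count must be a positive number of items")
    return (end_count - start_count) / start_count * 100


# Hypothetical totals for illustration only (not real WRN data).
start_total = 10_000
end_total = 14_373

print(f"{percentage_growth(start_total, end_total):.2f}% growth")
```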

Wednesday, 4 August 2010

UKCGE Report on PhD Theses Confidentiality

An interesting report from Tina Barnes, UK Council for Graduate Education, looks at the issue of confidentiality and embargo requests on PhD theses. The report is based on a survey conducted in March 2010, with reflections on a previous 2005 survey on the same topic.

Barnes reports that the most commonly cited reason for an embargo is the protection of 'commercial interests.' However, the number of requests has not increased since the first survey in 2005 despite the progression of open access and e-deposit. This, it is claimed, is due to e-submission and repository deposit not yet becoming standard practice within UK HEIs.

The report also comments on alternative approaches to the e-presentation of embargoed theses such as 'embargoed appendices.'

Tuesday, 3 August 2010

New IPR discussion papers

Two new IPR discussion papers have passed under my nose in the last couple of days that others may find of interest:

Korn, N and Oppenheim, C. July 2010. JISC IPR and Licensing White Paper: A Discussion Piece. Version 1.0.

British Library. Driving UK research: is copyright a help or a hindrance?- a perspective from the research community.

Both of these pieces question current IPR and copyright practices and the detrimental effect they may be having, in the digital age, on current research and research practices.

Thursday, 15 July 2010

Reflections from OR2010: Part 2

Another activity I was involved in at OR2010 was the Developer Challenge as a ‘non-techie’ judge. Organised by the DevSci project (managed by UKOLN, funded by JISC), this year’s challenge was to ‘create a functioning repository user-interface, presenting a single metadata record which includes as many automatically created, useful links to related external content as possible.’

The winning entry was from Richard Davis and Rory McNicholl, University of London Computer Centre (ULCC), who enhanced the records of the Linnean Collections, which are held on EPrints and managed by ULCC. As many of the metadata fields in the record as possible linked out to external sites, some general, such as Google and Wikipedia, and some more subject specific, such as horticultural indexes. Although only one metadata record was demonstrated, the links which appeared in the record were determined by the entries on a master sheet (an Excel spreadsheet) and would therefore apply to all records within the repository. This development came out top because, although the links weren’t truly automated, they were managed externally. It was felt that this was actually an advantage, as a non-techie repository manager could update the links themselves rather than calling on the support of a tame developer.

Coming in a narrow second was ChEsis, presented by Sam Adams, University of Cambridge, which created links to enhance Chemistry e-thesis records. Links were available to show the chemical structures of molecules/crystals used or created, along with their mass spectra. Fun links were also included, such as the Last FM playlist of the student when their thesis was submitted and the BBC headlines for the day.

Another entry utilised OpenCalais to automatically create links from their created repository record. OpenCalais is a free to use service which automatically creates links from open content to other open content sites such as Twitter, YouTube, Flickr etc. It can be used to add a bit of fun to any open source web content such as a blog but be warned the links are automatic and you can’t necessarily restrict what content it links to!

A full write-up and videos of all the Developer Challenge entries are available via the DevSci blog.

Wednesday, 14 July 2010

Reflections from OR2010: Part 1

Last week Antony and I attended the 5th International Conference on Open Repositories in Madrid. The conference boasted a fully packed, 4 day programme including ‘General’ presentation sessions, User group sessions, working groups and forums. Nearly 500 delegates were in attendance, representing countries from all across the globe.

One of the reasons Antony and I were in attendance was to present a Poster, authored in conjunction with Glen Robson and Ioan Isaac-Richards from the NLW, about the work of the Welsh e-theses harvesting service. A copy of the poster is available from the Aberystwyth University repository CADAIR.

With parallel streams running for the majority of the programme there were too many sessions for one person to attend, let alone comment on, so below I’ve discussed the sessions I found of most interest and relevance to the work of the WRN.

The first couple of interesting sessions related to nationwide open access/repository support networks: the first located in Germany, the second in Australia. The OAN (Open Access Network), initiated by DINI (the German Initiative for Network Information) and funded for a two-year term by the German Research Foundation (DFG), has created an over-arching infrastructure between quality-certified German IRs to act as a single interface for research promotion and to support other DINI Open-Access projects. DINI certification, a certificate of IR quality, denotes that an IR utilises international standards (such as DRIVER for metadata), has defined its policies regarding use and made them clearly available, and is well-positioned within both its own institution and the greater open access arena.

The OAN harvests data from the DINI certified repositories within Germany, aggregates the data and puts it through a number of value added modules such as data clean-up, FT link finding, OCR, and citation tracking. The aggregated data is then presented within a single search interface, and acts as a single point for data export and further harvesting. It also acts as a single point for the other OA projects, some of which were presented at OR2010, such as OAS (Open-Access Statistik) and OAFR (OA Subject Based Repositories).

The OAN is also responsible for increasing the number of certified repositories and offers support to repository managers to help their repository achieve certification. The alignment of WRN repositories, specifically in the area of policies, is an area of focus for the WRN team this autumn, so the DINI certification criteria should work well as a basis for this work.

Caroline Drury, University of Southern Queensland, presented on ANDS (the Australian National Data Service), a service looking to inform and influence national policy on the curation of data. ANDS has created Research Data Australia, a central collection of curated data sets produced by Australian academics. ANDS also offers the following services, which are related to this central collection: Publish my data; Register my data; Identify my data. Also based at Queensland is Tim McCallum, the technical support half of the CAIRSS repository support team (the team resembles the WRN team, with one technical and one organisational support officer). Piggy-backing on a CAIRSS repository survey, ANDS has been investigating the data management practices at Australian universities. This survey found that there was a low level of repository manager involvement in data management within universities, a trend that ANDS, in conjunction with CAIRSS, are looking to change through senior management intervention. Data management is a new area of interest for the WRN so we will be watching the progress of ANDS with interest.

The other session of direct relevance to the work of the WRN, and more specifically to the e-theses harvesting service presented in our poster, was from Nikos Houssous, National Documentation Centre (EKT), Greece. Nikos described the National Archive of PhD Theses developed at EKT, a single search interface presented within DSpace. Like the NLW in Wales, the EKT have a historic role in the collection of Greek PhD theses, a role they were looking to extend to the digital realm. The EKT are undertaking a digitisation project of the PhDs currently held in print form, as well as encouraging institutions to submit theses electronically. Records are held in a bespoke theses admin system and then pushed both to the DSpace system (via SOAP in ETD-MS, a metadata standard for e-theses devised by the Networked Digital Library of Theses and Dissertations (NDLTD)) and to the EKT Library Catalogue (via Z39.50 in UNIMARC). The DSpace collection also forms a central harvesting point for DART Europe, a service aggregating PhD theses records for the whole of Europe. I was unaware of NDLTD and ETD-MS before Nikos’ presentation and their relation to DART is of interest to the next stage of the Welsh e-theses harvesting service.

Through other sessions and networking I became aware of two other national aggregation services: NARCIS in the Netherlands and RCAAP in Portugal. Whereas RCAAP is an aggregation of IR content, NARCIS is an aggregation of IR and national information, such as DANS (Data Archiving and Networked Services). There are also plans to incorporate the data from METIS, the Dutch national CRIS, which will provide much richer information about researchers and their projects. Anecdotally, the NARCIS presenter reported that theses and dissertations were the most frequently retrieved items through the system, perhaps because NARCIS provided the only central point of discovery for these types of items.

It’s certainly nice to know that the work of the WRN parallels that carried out within other countries and that we have an extended network to call upon when in need of best practice advice.

Monday, 12 July 2010

2010 Ranking Web of Repositories

The second edition of 2010 Ranking Web of Repositories has been published:

Close to 1000 repositories have been analyzed this year and the top 800 are ranked here according to their web presence and visibility. The aim of this ranking is to support Open Access initiatives and therefore the free access to scientific publications in an electronic form and to other academic material. The web indicators are used here to measure the global visibility and impact of the scientific repositories. Two lists are available - top 800 and top 800 institutional.

I've done a bit of trawling and number crunching on the institutional list, extracting both a ranked list of UK-only institutional repositories and a subset of the Welsh repositories that appear in it. Of the top 800 institutional repositories globally, the UK has 82 entries, five of which are from Wales. Details:

International rank (UK rank in brackets)

257 (18th in UK) Aberystwyth
601 (62nd in UK) Bangor
696 (73rd in UK) Glamorgan
730 (78th in UK) UWIC
752 (80th in UK) Trinity

Bits and bobs

Over the past week or so I've collected together a few random repository related items that might be of interest to our partners. Enjoy!

Copyright Workflows
Ann Hanlon and Marisa Ramirez. "Asking for Permission: A Survey of Copyright Workflows for Institutional Repositories" 2010
Available at:

This poster details the results of a US survey about copyright workflows and was presented at the Annual Conference of the American Library Association, Washington, D.C. in June 2010. It nicely summarises the preliminary findings, exploring the staffing, resources, activities and tools employed to clear copyright for published work intended for deposit in an IR. In 2008 a survey was undertaken in the UK on the same topic:

Jones, Mark. Intellectual property rights survey, University of East Anglia, 2008

New Team Digital Preservation Film
WePreserve and Planets have released their fourth Team Digital Preservation film. Team Digital Preservation and Arctic Mountain Adventure is available to view at

Digiman is baby-sitting his niece and nephew for the weekend, but things go horribly wrong when he sends them out on an arctic mountain adventure. Never fear trusty viewers, PLATO, the Planets Preservation Planning tool, comes to the rescue to show Digiman the error of his ways.

Other editions of these popular videos are available here

Metadata Forum
At the Open Repositories Conference 2010 last week in Madrid the Metadata Forum was officially launched. A new initiative, run by UKOLN at the University of Bath and funded by JISC, the Metadata Forum is planning four face-to-face meetings throughout the UK and ongoing conversations online where anyone who has an interest in metadata can ask for help, share experiences and learn from others. The Forum is open to everyone, from novice to expert and anyone in between who deals with metadata in their day-to-day work.

Get involved by following the Forum blog - or following the Forum on Twitter – @MetadataForum

SWORD v2.0: Deposit Lifecycle white paper

The aim of this paper is to stimulate discussion around introducing more complete treatment of "deposit lifecycle" management of objects in digital repositories, and to propose the next small steps in this direction. Abstract:

"SWORD is a hugely successful JISC project which has kindled repository interoperability and built a community around the software and the problem space. It explicitly deals only with creating new repository resources by package deposit a simple case which is at the root of its success but also its key limitation. This next version of SWORD will push the standard towards supporting full repository deposit lifecycles by using update, retrieve and delete extensions to the specification. This will enable the repository to be integrated into a broader range of systems in the scholarly environment, by supporting an increased range of behaviours and use cases."

Tuesday, 29 June 2010


The WRN team now have a delicious site available at

Delicious is a social bookmarking service that allows users to tag, save, manage and share web pages centrally. For more information about social bookmarking a useful explanation is available on Wikipedia.

Here in the WRN offices we started to use delicious to gather together sites of potential interest to us internally as a project team, but we've quickly come to realise that having access to these links would also be of use to the wider repository community within Wales.

There are various ways of exploring content on delicious; I find one of the most useful is to use the tags list from the right-hand menu to look at sites gathered together under various themes. This will quickly take you out and beyond the sites we've gathered at WRN into the wider collection of URLs from the whole site.

As with many other Web 2.0 tools, making the site useful and developing an active community around it actually takes a lot of effort - and to be honest our work on populating the site with links to date has been sporadic. It is another of those changes to the way we work and think that doesn't yet happen automatically. However, with time we hope to improve this situation and keep adding useful sites as we come across them.

If you know of any sites that would be useful to add to our page, or if you would like to become an active contributor to our site, then please just drop us a line via the usual email

Friday, 11 June 2010

Advocacy discussion: barriers and solutions

As part of the Repository Stream at the Gregynog Colloquium we held a discussion session on the hurdles faced by Repository Administrators when trying to encourage academic buy-in to their systems. These have been listed below and grouped into topics.

As part of the discussions we also suggested solutions for each of the obstacles. These appear on the line after each problem raised. The solutions are by no means exhaustive and there are some gaps.

Please add comments and suggestions to the list below, and suggest advocacy ideas that have worked for you. It is hoped this exchange of ideas will aid both our WRN community and the repository community as a whole.

Perception of time and effort required

No time
Demonstrate ease of deposit. Video materials to demo deposit using academic champions. Practice reduces time. Look into automatic completion APIs for repository.
Extra admin work
Mandate. Suggest using admin staff or PhD students to help- good practice for new researchers.
Backlog of research will take too much time to enter
Offer self-deposit to relieve backlog then encourage self-deposit. Suggest using admin staff or PhD students to help.

Benefit of repository interaction

What’s in it for me?- Apathetic to the process
Education- more widespread audience; greater recognition; higher/ faster/ sustained citation rates. Demonstration of RAE impact. Use of peers as champions. Video materials?
The paper is already published- anyone who wants to read it already has
More widespread audience- publicly funded research available to the whole of the public beyond subscription barriers.
Takes time to see benefit
Difference between print and electronic world?

Perception of repository importance

Lack of integration with other Uni systems and processes
Top-level buy-in to push for integration/ Mandates
Repository is an archival end point
Education on benefits- use as Management Information tool
Perceived value of system through lack of dedicated staff time
Top-level buy-in to fund positions to administer repository. Use further staff network- subject liaison; research administrators- to spread load and form experts for each school/ collection.

Copyright and IPR issues

Unsure of copyright status in papers
Use of SHERPA RoMEO/ include API on repository front page
Unsure of what was signed away with publishing license
Education. Feedback from academics to publishers. UKCoRR MoU
No longer have copies of different versions
Worries about plagiarism and IPR protection
No real difference between print and online world. Getting the paper out on the web and recognised as author’s work should counteract plagiarism risk. Benefits associated with citation rates and recognition should outweigh IPR risks.

Conflicts with traditional publishing

Publishing within a prestigious journal the priority
Use of OA funds to encourage OA publishing
Older research is no longer felt relevant
Evidence of older PhD work being requested for digitisation as now informs modern research.

Other issues

Collection policy confusion- what can be accepted
Have clear collection policy stated within repository site FAQ
Can the repository take different file types?
Have clear collection policy stated within repository site FAQ- the repository can store different file types but can end users access them easily?/ Preservation.
Don’t want to make draft version publicly available

Gregynog Repository Stream

The presentations delivered during the Repository Strand at this week's Gregynog Colloquium are now available online on our project website, or via the relevant links below.

The Power of the Mandate - Sue Hodges, University of Salford.
Research Publishing at Swansea University - Alex Roberts, Swansea University.
Research management system at the University of Glamorgan - Leanne Beevers and Neil Williams, University of Glamorgan.
Developing a repository: caring, sharing and living the dream - Misha Jepson, Glyndŵr University.
Encouraging author self-deposit at Cardiff University - Tracey Andrews and Scott Hill, Cardiff University.
Using statistics as an advocacy tool - Nicky Cashman, Aberystwyth University.
Advocacy: the theory - Jackie Knowles, WRN.

Tuesday, 8 June 2010

New IR Cross Search Service launched in Ireland

RIAN is a newly launched cross-search service for content held within seven HEI IRs in Ireland: DCU, NUIG, NUIM, TCD, UCC, UCD and UL. An outcome of a Strategic Innovations Fund project, it was sponsored by the Irish Universities Association (IUA) and funded by the Irish Higher Education Authority (HEA).

The aim of the service is 'to harvest to one portal the contents of the Institutional Repositories of the seven university libraries, in order to make Irish research material more freely accessible, and to increase the research profiles of individual researchers and their institutions.'

Thursday, 3 June 2010

DSpace Add-Ons

The following are descriptions of a range of DSpace add-ons available to install which provide additional functionality to the software.

The commenting feature brings informal communication capabilities to the DSpace environment. A threaded forum, or comments stream, can be attached to any web-page, community, collection, submitted item or e-person within DSpace. The add-on allows comments to be inserted by both anonymous users and authenticated ones while functionality for reviewing/moderating comments is also provided.

An example (in Portuguese) of comments appearing below a collection.

Compatible with DSpace 1.1 and 1.2, with the possibility of updating for DSpace 1.5.x

Controlled Vocabulary/Ontology

This add-on applies a subject classification system of the institution's choice to their DSpace instance. Once implemented, the user chooses from the predefined taxonomy of keywords to describe items being submitted to the repository, and that same taxonomy is used to find and access items held in the repository.

An example (in Portuguese) of a subject classification system in use.

Compatible with DSpace 1.1 and 1.2, with the possibility of updating for DSpace 1.5.x

Dublin Core Meta Toolkit

The Dublin Core Meta Toolkit gives DSpace administrators the ability to convert large amounts of information from their desktop database programs into DSpace-compatible Dublin Core metadata. The toolkit provides a number of out-of-the-box database structures to ease data collection as well as enabling users to create custom converters for existing databases. The Toolkit is ideal for converting data held in Microsoft Access, MySQL and comma-separated values (CSV) formats.

Compatible with DSpace 1.5.1


Content submitted to a repository may be restricted by laws, policies, or contractual obligations that require the submitter not to publish or enable public access to the content for a period of time. This add-on allows DSpace administrators to build in functionality to handle embargoed items in the workflow. It allows for the metadata of the embargoed item to be indexed and viewed, but the full text of the item cannot be retrieved while the embargo is in force.

Compatible with DSpace 1.4.x, 1.5.x and 1.6
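
To make the embargo behaviour described above concrete, here is a minimal sketch of the idea in Python. It is not the add-on's actual code: the `Item` class, its field names and the `public_view` function are all hypothetical, invented purely to show metadata staying visible while the full text is withheld until the embargo date has passed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Item:
    """Hypothetical repository item; field names are illustrative only."""
    title: str
    full_text: str
    embargo_until: Optional[date] = None  # None means no embargo


def is_embargoed(item: Item, today: date) -> bool:
    """An item is embargoed while today is before its embargo end date."""
    return item.embargo_until is not None and today < item.embargo_until


def public_view(item: Item, today: date) -> dict:
    """Metadata is always shown; full text only once the embargo has lapsed."""
    return {
        "title": item.title,
        "embargo_until": item.embargo_until,
        "full_text": None if is_embargoed(item, today) else item.full_text,
    }


thesis = Item("An example thesis", "Chapter 1 ...", embargo_until=date(2012, 1, 1))
print(public_view(thesis, date(2010, 6, 3))["full_text"])   # withheld
print(public_view(thesis, date(2012, 1, 1))["full_text"])   # released
```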

Format validator and virus check

This add-on provides rough-and-ready format checking by identifying that the file/bitstream extension matches formats verifiable by JHOVE. Currently DSpace accepts a deposit's file extension as gospel, so a user could tack a .txt extension onto a GIF and DSpace would assign the incorrect format to the file based on that incorrect extension. It also checks the file for the presence of viruses.

Compatible with DSpace 1.4.x and 1.5.x
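
The GIF-renamed-as-.txt problem can be illustrated with a short Python sketch. This is not how the add-on or JHOVE works internally; it simply compares a file's extension with the well-known signature bytes at the start of the file. The function names and the small signature table are assumptions made for the example, and the virus check is not covered.

```python
import os
from typing import Optional

# Leading "magic number" bytes for a few common formats.
SIGNATURES = {
    "gif": (b"GIF87a", b"GIF89a"),
    "pdf": (b"%PDF-",),
    "png": (b"\x89PNG\r\n\x1a\n",),
}


def sniff_format(data: bytes) -> Optional[str]:
    """Guess the format from the leading bytes, or None if unrecognised."""
    for fmt, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return fmt
    return None


def extension_mismatch(filename: str, data: bytes) -> bool:
    """True when the content is a recognised format that contradicts the extension."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    detected = sniff_format(data)
    return detected is not None and detected != extension


# A GIF with a .txt extension tacked on is flagged; an honest GIF is not.
print(extension_mismatch("notes.txt", b"GIF89a\x01\x00"))
print(extension_mismatch("logo.gif", b"GIF89a\x01\x00"))
```

Content with no recognised signature (plain text, for instance) is deliberately not flagged here, since the sketch has no way of verifying it either way.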


This DSpace content-based recommendation feature automatically shows links to articles within the repository that a user is likely to be interested in, by identifying items related to the document the user is currently viewing. Similar to functionality seen on Amazon, this feature can greatly improve user experience.

Compatible with DSpace 1.1 and 1.2, with the possibility of updating for DSpace 1.5.x

Request Copy

This add-on creates a semi-automated mechanism whereby would-be users can request and authors can email an individual copy of a full-text deposited within the repository whose full-text access privileges are set to restricted.

The purpose of this feature is to increase both the content deposited in an IR and its immediate usability by providing a way to accommodate the (frequently unfounded) worries of authors and their institutions about copyright infringement during any publisher embargo periods on public self-archiving.

The link is provided on all non-OA items and activates a form where the requester must enter their name and email address, may add a comment, and then presses a 'Request-a-copy' button. An email containing a token is sent to the depositor. Using that token, the author may reply by clicking one of the two buttons available: 'Send Copy' or 'Don't send copy'.

Compatible with DSpace 1.4.x and 1.5.x
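
The request-and-token exchange described above can be sketched roughly as follows. This is an illustration of the workflow only, not the add-on's code: the `RequestCopyDesk` class and its method names are hypothetical, the email step is reduced to returning a message string, and the email address is a placeholder.

```python
import secrets


class RequestCopyDesk:
    """Toy model of the request-a-copy workflow (hypothetical, in memory only)."""

    def __init__(self) -> None:
        self._pending = {}  # token -> details of the outstanding request

    def request_copy(self, item_id: str, name: str, email: str, comment: str = "") -> str:
        """Record a request and return the token that would be emailed to the author."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = {"item_id": item_id, "name": name,
                                "email": email, "comment": comment}
        return token

    def respond(self, token: str, send_copy: bool) -> str:
        """Author clicks 'Send Copy' (True) or 'Don't send copy' (False)."""
        request = self._pending.pop(token, None)  # each token works only once
        if request is None:
            raise KeyError("unknown or already used token")
        if send_copy:
            return f"Copy of {request['item_id']} sent to {request['email']}"
        return f"Request for {request['item_id']} declined"


desk = RequestCopyDesk()
token = desk.request_copy("item-42", "A. Reader", "reader@example.org")
print(desk.respond(token, send_copy=True))
```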

Semantic Search for DSpace
Semantic Search allows intelligent search of DSpace content, using Semantic Web technologies and performs knowledge discovery on DSpace metadata. Semantic Search uses the science of meaning in language, to produce highly relevant search results.

Compatible with DSpace 1.4.2, with the possibility of updating for DSpace 1.5.x

This add-on allows gathering, processing and presenting usage, content and administrative statistics from the repository. The system is based on components that can easily be configured, changed or extended, to respond to different information needs.

Compatible with DSpace 1.4.x and 1.5.x (JSPUI)

The add-on allows a tombstone to be added when an item is withdrawn from the repository. The user selects from 3 reasons for withdrawing the item: 1. Removed from view by legal order; 2. Removed from view by the [authority doing removal]; 3. Removed from view at request of the author.

Compatible with DSpace 1.4.x and 1.5.x

Further details about this range of options are available here. Any WRN partner interested in discussing or investigating any of these DSpace add-ons should contact the team using the usual address

Wednesday, 26 May 2010

Statistical Evaluation

The WRN project team is currently looking in some depth at evaluating our project activities. We are concerned with gathering both non-numerical qualitative data to analyse about our activities (stories/opinions and narratives from our users) alongside some more quantitative statistical measures about repositories and their use across Wales.

Surprisingly, the collection of the statistical aspect of our evaluation data - something we originally envisaged as being the quick and easy stuff to generate - has proved to be quite problematic. Establishing a base line set of measures has been difficult with varying data coming out of everyone's systems and a lack of consistency in obtaining measures for central recording purposes. Even the most basic measure of all, i.e. how many deposits are recorded each quarter in each repository, can be difficult to obtain and we are only just managing to make this measure something we accurately record in 100% of the repositories across Wales.

So, while we hear lots of stories about the power of statistics and the help they can offer in making a case for a repository, it seems that we still have some work to do to convince people it is worth the effort of setting up robust statistical measures. We thought we'd try and address this by providing information about a selection of basic options open to most repository managers. The following information, from the Digital Repositories InfoKit, provides an overview of some of the most commonly employed methods of collecting statistics:

Any WRN partner interested in reviewing their statistics and collection methods, or needing assistance in setting up any of the tools mentioned here, should contact the project team via the usual email at

Monday, 17 May 2010

CRIS Event Cafe Society Write Up - Group 4: Data Quality

At the JISC/ARMA Repositories and CRIS event 'Learning How to Play Nicely', held at the Rose Bowl, Leeds Met University on Friday 7th May, the afternoon was dedicated to a cafe society discussion session. Four topics were explored by delegates and over the course of four blog posts we are disseminating the facilitator reports from each session.

Please use the comment option below to contribute or comment on these discussion topics.

Group 4 - Data Quality
Facilitator: Simon Kerridge, ARMA

The issue to be discussed was Data Quality, framed as “How do we ensure data quality in our systems? What are the best methods for getting data out of legacy systems?” However, a number of related issues also cropped up in the discussions.

The time was split into four 30 minute slots with delegates attending as many times as they liked. Some issues were identified on many occasions and others less often; most are presented here.

Unique Identifiers (for many, perhaps all, data items) were considered to be a big issue. Examples included:
• PersonId: usually there is no single one used across an institution; the various systems (eg HR, CRIS, IT, PGR and others) generally use different ids. Moreover the HR system, which might seem like the obvious primary source, might have multiple entries for the same person (if they have more than one contract) and, worse, usually only has entries for paid staff – there are many examples of unpaid people involved in research.
• FunderId: many expressed problems with de-duplicating similar looking funders. It was thought that the funders themselves could/should provide a unique reference
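One way to picture the PersonId problem is as a crosswalk that maps each system's local id to a single canonical key. The sketch below is only an illustration under that assumption: the canonical keys, system names and ids are all invented, and none of the institutions at the session described their systems in these terms.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (canonical person key, system name, local id).
crosswalk_rows = [
    ("P1", "HR", "H-1001"),
    ("P1", "HR", "H-2002"),   # second contract, same person
    ("P1", "CRIS", "C-17"),
    ("P2", "PGR", "G-300"),   # unpaid researcher with no HR entry
]


def build_lookup(rows):
    """Map (system, local id) to the canonical person key."""
    return {(system, local_id): person for person, system, local_id in rows}


def ids_by_person(rows):
    """Group every local identifier under its canonical person key."""
    grouped = defaultdict(list)
    for person, system, local_id in rows:
        grouped[person].append((system, local_id))
    return dict(grouped)


lookup = build_lookup(crosswalk_rows)
print(lookup[("HR", "H-2002")])            # canonical key for a second HR contract
print(ids_by_person(crosswalk_rows)["P1"])  # every id held for the same person
```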

Authority Lists
• Even if an institution could de-duplicate all its own data and use a single id internally, it was likely that other institutions would not use the same system and so exchange of data would be problematic. This could be resolved by an agreed independent authority (for example the staff HESAid), but no such authority exists for Funders, for example. This was thought to be something that would be extremely useful.
• A national policy on national data (eg FunderId) was seen as desirable
• Scopus / WoS / PubMed were seen as possible partial authority lists for publications (and authors) but they contain differing information and do not cover the whole spectrum – and indeed are not worth using in some subject areas

Data Quality
• Many places have a feedback loop (eg showing academic staff each month what has been added to their profile).
• Use carrots and sticks, eg only allow publications from the IR/CRIS to be used on internal promotions or for the annual report
• One stick method that was generally liked was the Norwegian system where, in order to receive public funding for a research project, a prerequisite was that all of the authors' publications (where possible) had to be submitted to an open access repository
• Good enough is good enough
• Data should be re-used where possible, but only where it is appropriate; sometimes systems can be developed organically to meet too many requirements and end up not doing any of them well
• Try to think about potential future use of data and collect what you might need – but don’t go overboard. For example one institution has additional classification for all publications using the Library of Congress system, but so far has not used that metadata
• Have processes in place to check data quality on input and as a secondary check to ‘approve’ the data – one institution has a ‘checked by Carol’ flag!
• In general self-archive was not approved of due to the lack of quality and copyright checking
• There is some good software available for data quality checking against publications (using Scopus / WoS / PubMed data) and for data aggregation
• One institution uses Levenshtein string comparison to help identify possible duplicate entries
• The RAE / REF was seen as a good driver for increasing data and data quality
• Periodic data maintenance and cleansing is essential, but often not undertaken – data quality is unsexy!
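To illustrate the duplicate-checking idea in the list above, here is a minimal Python sketch. It assumes the string comparison mentioned refers to Levenshtein (edit) distance, and it is not the software any institution at the session actually uses. The threshold of 3 edits and the sample titles are arbitrary assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def possible_duplicates(titles, max_distance=3):
    """Return pairs of titles whose normalised forms are within max_distance edits."""
    # Normalise case and whitespace so trivial differences do not count as edits.
    normalised = [" ".join(t.lower().split()) for t in titles]
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if levenshtein(normalised[i], normalised[j]) <= max_distance:
                pairs.append((titles[i], titles[j]))
    return pairs


# Made-up titles, for illustration only.
titles = ["Open Access in Wales", "Open access in  Wales.", "Repository statistics"]
print(possible_duplicates(titles))
```

Comparing every pair is fine for a few thousand records but grows quadratically, so a larger repository would probably need to group candidates first (for example by year or first author). Flagged pairs are only candidates and would still need the human check described in the list.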

Data Sharing
• Authority lists would make this much easier – surely some work can be done in this area?
• Two institutions recounted the issues of doing a joint submission to the RAE and the data fusion issues. It simplified a later choice of IR: the second institution simply plumped for the same as the first

Parallel Systems
• Many reported using parallel systems within their institutions as the data in the (normally) central system was simply not trusted by all the users.

• It was universally agreed that problems tended to occur where an issue was not given a high enough priority by the institution. For example, if a DVC took an interest in the quality of data in the IR then resources were made available to improve the processes and data quality.

Legacy Systems
• Often resources were made available for moving data from a legacy system to a new one
• However this was often seen as solving data quality issues, whereas in reality it is an ongoing issue, but often not resourced as such

Primary Data Source
• It was agreed that there is not one system for all of an institution's data needs. Indeed that might not be desirable as individual systems tend to meet different requirements.
• However it should be known where the primary data resides, understanding that for a single record (eg information about staff) this might not all be in one system

Summary (the facilitator's view of the discussions)
Overall the discussions were very open and positive. Many participants took away some ideas for use in their own institutions. Most were also sure that they would not find it easy to get the resource required to do a proper job in improving their data quality. Some systems were reportedly working very well, other systems were not. In general the former were the result of new developments whereas the latter tended to be systems that have been in use for a while. Hopefully this is the result of better new technology being used to support processes; however it seems likely that the reason is more to do with systems being neglected once they are seen as being embedded and working.

CRIS Event Cafe Society Write Up - Group 3: Stakeholder Engagement

At the JISC/ARMA Repositories and CRIS event 'Learning How to Play Nicely' held at the Rose Bowl, Leeds Met University on Friday 7th May the afternoon was dedicated to a cafe society discussion session. Four topics were explored by delegates and over the course of four blog posts we are disseminating the facilitator reports from each session.

Please use the comment option below to contribute or comment on these discussion topics.

Group 3 - Stakeholder Engagement
Facilitator: William J Nixon, Glasgow University

The afternoon session of the “Repositories and CRIS” event was an opportunity to bring Research Office and repository staff together across a range of topics and to draw lessons from other institutions, raise issues and share experiences. The focus of the discussion was “Stakeholder Engagement”, with the questions: “Who are the main stakeholders and how do we engage them? What do academics think?”. Over the sessions the focus was on researchers, research office and repository staff – but we acknowledged that there were many other stakeholders for our systems. These include funding bodies, University management, JISC, HEFC and RMAS amongst others.

The café approach to these sessions enabled attendees to stay for as long as they wanted, to move on to other sessions, and in some cases to return. Many of the initial attendees stayed across the first two sessions. The sessions had a good mix of research office and repository staff, both attending and contributing.

Key themes
• Key stakeholders – who are they?
• Workflows – what comes first, the CRIS or the IR?
• Carrots and sticks

Key stakeholders
In each session, there was an opportunity for staff to identify themselves as either research office or repository staff which was a useful starting point.

The initial discussions in each of the sessions, some of which overlapped, considered who the key stakeholders are, with a particular focus on academic staff. It became clear very quickly that it was insufficient to talk just about researchers as a homogeneous group, and there was some discussion around unpacking them, guided not by discipline or research itself but by the nature of their funding and the length of their post, so we identified:
• Researchers
• Contract staff
• PhD staff

These roles have created a shifting landscape not only for researchers themselves and their work/funding but for the CRIS/IR staff who support them.
The discussions here were then around how much these staff know about, or are aware of, the CRIS or the repository, in order to set a baseline for engagement. It was felt, certainly for IRs, that there was still insufficient awareness of these “invisible services”.

One approach which some institutions have begun to take is to build sessions on the IR and CRIS into new research staff’s training. This opportunity to embed the information into existing courses was felt to be very valuable.

At other institutions IR staff have been invited to be involved in Research Staff meetings and conferences.

Other Library and research office staff were recognised as stakeholders and as these CRIS and IR services have matured beyond a “project set-up” it is also necessary to inform and to engage them.

Some institutions have worked to inform and update their subject Librarian staff to act as advocates for the IR and for open access; others though preferred to manage this through the smaller repository team who they felt were better able to answer the range of queries which academic colleagues would ask. These include copyright, versioning issues and funder compliance.

Workflows and scope – what comes first, the CRIS or the IR?
There was some discussion, particularly around researchers and their publications, about what should come first: should a record be created in the CRIS and then, as appropriate, fed through to an IR, or should a publication just be deposited or entered into the IR? A third option was an additional publications database which was not part of the CRIS or the IR.

In some institutions the CRIS is or will be used to store the publication data while the repository is only used to hold the full text. A key challenge for one institution was the move to a CRIS for managing its publications with the expectation that research staff would manage their own publications. This was in contrast to the mediated service which the Library had provided [but was felt to be unsustainable in the longer term].

Questions here ranged around who would manage the import of this data, its management (“clean-up”) and its acceptance/review. We also considered acceptable turnaround times for managing any review of the data before it became live – and how that could be used to support engagement with staff.

Different workflows and staff resources were also covered. These ranged from self-deposit/submission to a wholly mediated service just done by the Library for the IR. This seemed to be less of an issue for data for the CRIS.

The need to engage with departmental administrative staff as a stakeholder group was identified here as one solution for this issue. These staff have the local knowledge and many are in departments dealing with publications, the CRIS and web pages.

Working with them for the IR (and CRIS) is a good way forward. Some institutions have taken this forward and provide training and support for these staff for the IR in a similar fashion to that provided for the CRIS.

Repository staff in particular were also concerned that a focus on bibliographic data in the CRIS, or in an IR holding a mix of full text and bibliographic data, could mean the importance of full text was lost.

The comment was also made that the “repository is a set of services” not just an entity in itself – and one which can take on a range of roles including digital preservation, research assessment and open access.

Different institutions approached this differently and it was felt that there was no right answer or one size fits all. Different institutions and the needs of different stakeholders will dictate the workflows, but the need to engage staff at all levels is crucial. It was felt that this was most effective when the CRIS and IR could demonstrate value-added services [“carrots”].

Carrots and sticks
There was a considerable amount of discussion around the “carrots and sticks” for depositing material into the repository, or dealing with it in a CRIS. Did these help or hinder the engagement with stakeholders? Some of this flowed from the concerns over the sustainability of the mediated approach to deposit, the range of content which may be accepted to the IR and its public availability [a need for a dark archive?].

Carrots (and value adds):
• Increased visibility in Google
• Re-use of content in the IR (or CRIS) in personal websites etc
• The inclusion of citation data from Google Scholar, Scopus, Web of Knowledge
• Business intelligence opportunities
• Inherent value of discovery/availability
• Adding value to the research agenda

Sticks:
• Publications policies
• Funder mandates
• Professional development and review documentation

Final comments
This was a dynamic and rolling discussion throughout the afternoon, with a good mix of repository and research staff, across a wide range of stakeholder and engagement issues. This short report provides a flavour of the key themes which emerged and were explored across the four 30 minute sessions. In addition to those already detailed, other issues raised included questions about research data being held, when and by whom.

It was clear that research office and repository staff are engaging with a wide range of stakeholders in a variety of different ways, with varying degrees of success. Increased co-operation, co-ordination and a shared understanding of the work each group is doing would help.

CRIS Event Cafe Society Write Up - Group 2: DIY v. Commercial Solutions

At the JISC/ARMA Repositories and CRIS event 'Learning How to Play Nicely' held at the Rose Bowl, Leeds Met University on Friday 7th May the afternoon was dedicated to a cafe society discussion session. Four topics were explored by delegates and over the course of four blog posts we are disseminating the facilitator reports from each session.

Please use the comment option below to contribute or comment on these discussion topics.

Group 2 - DIY v. Commercial Solutions
Facilitator: Anna Clements, EuroCRIS

Format of discussion
Introduction from each member explaining what system/s they had at the moment – IR, CRIS or both; whether DIY or commercial; and whether considering going commercial if not already. Most had an IR but very few had a CRIS. We then discussed criteria to consider and other issues to think about when choosing DIY v Commercial – not in priority order:

Institutional requirements - Differ depending on size and particularly how research active the institution is [or would like to be] i.e. DIY may be fine for smaller, less research intensive institutions but larger, more research intensive institutions may find it easier to justify investment in commercial solution

Cost - Need to include total cost i.e. cost for in-house development and maintenance over lifetime of systems needs to be included. Senior managers often think a DIY system is ‘free’ as don’t see cost of internal resource.
Need to consider total cost across the sector if we are all reinventing the wheel – one commercial supplier estimates they have spent at least 12 man years developing their product; also consider the benefits of a collaborative approach to development, where several institutions work with a commercial supplier to build/improve a product collectively and therefore share costs and benefit from a better overall product

Control/Scope Creep - Two views on this :
1. DIY allows full control and so get exactly what you want – whereas commercial may deliver 75%
2. DIY ends in continual scope creep as difficult to say no internally – whereas with commercial products boundaries are clearer

Link to internal systems - Is DIY better here? … but the issue is more that there should be a buffer between each system and the CRIS, e.g. at St Andrews we have a data warehouse which acts as a data broker between the source systems [e.g. Human Resources, Registry, Finance] and the CRIS. If a change is made in a source system then we can reconfigure the views in the data warehouse to match the new source system but leave them unchanged as far as the CRIS is concerned. If this doesn’t exist then you have the problem of reconfiguring links whether DIY or commercial -> for the latter, therefore, it is important that the architecture of any commercial solution can cope with sync changes without major rewrites -> include this in your tender requirements.
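The buffer idea can be sketched with database views. The Python example below uses the standard sqlite3 module and is only an illustration: the table, column and view names are invented and are not taken from the St Andrews data warehouse. The point is that the CRIS side only ever queries the view, so a change in the source system means redefining the view rather than changing the CRIS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical HR source table as it looks before a change in the source system.
conn.execute("CREATE TABLE hr_staff (staff_no TEXT, surname TEXT, forename TEXT)")
conn.execute("INSERT INTO hr_staff VALUES ('S001', 'Jones', 'Ann')")

# The broker view exposes stable column names; the CRIS reads only this view.
conn.execute("""
    CREATE VIEW cris_person AS
    SELECT staff_no AS person_id, forename || ' ' || surname AS full_name
    FROM hr_staff
""")


def read_people(connection):
    """What the CRIS side runs; it never names a source table directly."""
    return connection.execute(
        "SELECT person_id, full_name FROM cris_person ORDER BY person_id"
    ).fetchall()


before = read_people(conn)

# The source system changes: a new table with renamed columns replaces the old one.
conn.execute("CREATE TABLE hr_employee (emp_id TEXT, family_name TEXT, given_name TEXT)")
conn.execute("INSERT INTO hr_employee VALUES ('S001', 'Jones', 'Ann')")
conn.execute("DROP VIEW cris_person")
conn.execute("DROP TABLE hr_staff")

# Only the view definition is rewritten; read_people is untouched.
conn.execute("""
    CREATE VIEW cris_person AS
    SELECT emp_id AS person_id, given_name || ' ' || family_name AS full_name
    FROM hr_employee
""")

after = read_people(conn)
print(before == after)
```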

Understand your data - Related to the point above. DIY or commercial, you need to understand what data you have in which systems at the institutional level; which is the golden source where there are multiple; and what keys/ids you can use to relate data together when you pull it all into the CRIS – this could mean considerable investigation, data tidying and work to review/improve data flows and related procedures to ensure good quality data going forward. At St Andrews we have found that such work leading on from the CRIS is beginning to feed through into an overall improvement in information management at the University. A CRIS is ideal for this because so many stakeholders within the University are involved [Researchers, Schools, Library, HR, Registry, Finance, Research Policy/Management, Senior Management] and NEED to be involved, whether as users of the CRIS or as data providers for the CRIS. One benefit of a commercial system could be that it insists on better quality data via business rules, such as always having a primary key (!), than a DIY solution does.

Product coverage - Be clear what each commercial product offers compared to what your requirements are. An example being whether you are looking just for a publications management system or a full-blown CRIS with links to students, staff, projects, events and activities

Switching from DIY to commercial - At least two institutions are trying a simple DIY solution first to find out what exactly is needed … with a view to switching to commercial product later. Disadvantage of this approach is that may then be difficult to persuade senior management to, as they see it, throw away the internal investment, at a later date.

Open source CRIS? - The question was asked whether there is perhaps a third way ;) - not commercial; not DIY alone; but DIY together, i.e. an open source solution. Why hasn’t that been done? Possible reasons: no academic interest in pursuing this [unlike for open access]; CRIS seen as a management information tool, like Finance or HR, rather than a tool for individual academics.

CRIS Event Cafe Society Write Up - Group 1: Drivers

At the JISC/ARMA Repositories and CRIS event 'Learning How to Play Nicely' held at the Rose Bowl, Leeds Met University on Friday 7th May the afternoon was dedicated to a cafe society discussion session. Four topics were explored by delegates and over the course of four blog posts we are disseminating the facilitator reports from each session.

Please use the comment option below to contribute or comment on these discussion topics.

Group 1 - Drivers
Facilitator: Andy McGregor, JISC

This session was designed to explore the issues that are driving the development of research management systems, processes and policies in universities.
This document reports on the issues raised during that session by the many people who joined in over the 2 hour course of the discussion.
During the session we looked at the drivers, then considered the ways that institutions were choosing to address those issues and finally used these approaches to develop a rough and ready action plan for institutions wishing to look at research management.

REF – the Research Excellence Framework was a clear priority for many of those present.

Efficiencies – many people felt that a joined up and embedded research management system would stop effort being duplicated and make some tasks much easier than they are at present freeing staff time to be spent on other tasks.

Funding – a good research management system could help institutions understand, monitor and manage research funding more effectively and enable it to target bids for funding in a more managed way.

Funder mandates – many funders are mandating the storage of research outputs and research data, a research management system could help institutions comply with such mandates.

Legal compliance – a research management system could help institutions manage compliance with data protection and freedom of information requirements in a more efficient and joined up way, greatly reducing staff time that needs to be spent on these tasks.

Business information – the information held by a research management system could provide valuable information about the operation of the institution such as identifying successful research clusters, or areas for potential collaboration. This would enable the institution to provide more focused support to researchers.

Business processes – the research management system could help institutions refine some of the processes and workflows for research and administrative tasks. This would make it easier for researchers to manage the administrative part of their research. It could also make it easier for researchers to fulfil obligations to funders and could support a more effective link between institutional and funder information and systems.

Benefiting research – a research management system could use the information about the institution's research to provide useful services to researchers. This could be something like a directory of expertise or a service to explore research happening in other institutions.

Open access – open access to research outputs can provide greater access to the literature for a researcher as well as enabling a greater number of people to access their research outputs. While this is an important driver, to some extent it is a result of some of the other drivers.

Collaboration (communities of practice) – a well managed research management system could help support researchers in finding suitable people to collaborate with and support the identification of communities of practice. This is an area where research management systems could link effectively with virtual research environments.

Knowledge exchange – having details of an institution's research on an easy to use website could help with knowledge exchange with business and with other nations.

In thinking about ways to address these drivers it is important to focus on the key reasons that an institution needs to implement a research management system. There is a danger that focusing too closely on one specific driver could produce a system that is only good for that particular purpose and does not meet the wider needs of the institution. This is especially serious when thinking about the REF, as specifying a solution too closely aligned to the REF may produce a system that is not suitable for future research assessment purposes.

While it was clear from the discussion that the impetus for the development or revamp of research management in institutions was coming from senior managers, it was also clear that it was members of the research office, library, and IT departments of institutions that were steering the specific nature of the implementation in each institution.

Responding to Drivers
Once the drivers were identified, the group moved on to discussing how the drivers could be addressed and what tasks were important in setting up a research management system. To help structure this session and to ensure that the tasks were grounded in the reality of the institutional setting we categorised each task into one of three cost categories: tasks that would not require extra funding and could be accomplished with existing resources, tasks that would cost a moderate amount of money (e.g. £10,000-£50,000) and tasks that would cost in excess of £100,000.

No cost tasks

Building relationships – it was clear from the whole day that building effective relationships was a key success criterion in developing a research management system in an institution. Building effective relationships between senior managers, researchers, research managers, librarians, IT, and other relevant systems is an essential early task that can be achieved without any extra resources. However, maintaining those relationships may take a lot of time and effort and therefore may need some extra resources.

Embedding the system in the institutional processes – to ensure successful uptake of any system, a number of people suggested that the system needed to be embedded in the institutional processes that affect researchers such as assessment and promotion. The group disagreed on whether this was a no cost or moderate cost task with some people feeling that the relationship building and advocacy/training that would be required would push this into the moderate cost bracket. However it was also noted that once the initial hump of getting the system embedded into institutional practice was surmounted then it could make complying with institutional requirements easier and quicker for researchers and therefore lower institutional costs.

Moderate cost tasks

Planning – obviously there is a fairly large planning overhead for implementing a research management system in an institution. This often involves a range of staff and is quite time consuming and so comes at a cost to an institution.

Publicity and advocacy – it is highly likely that any new research management system would require researchers to change their working practices. Significant advocacy and publicity would therefore be required to make sure researchers were aware of the system and how it would affect and help them. This is a resource intensive process in terms of staff effort and some materials costs, so it would require a moderate amount of resources dedicated to it.

Training – a related task to publicity and advocacy is training of researchers and administrators in the use of the system.

Understanding institutional requirements and systems – before an effective research management system can be designed a good understanding of institutional requirements, systems, existing processes and people involved must be developed. This will involve a range of departments and roles and could be quite time consuming but it is an essential step in designing a system that will fulfil institutional requirements.

User consultation – just as it’s important to understand institutional requirements and systems it is also vital to understand the needs and current practices of the people who will end up using this system. This is important in making sure the system meets their needs but it is also important in getting early buy in from users and in managing their expectations. This is a very important part of the planning and implementation process and the group concurred that this was worth dedicating a decent amount of resources to.

Developer time – this is essential if institutions choose to build a home grown system. However it is also important if institutions choose to buy a system in, as developer time will be needed to ensure the system links well with other institutional systems. This doesn’t come cheap and can be a significant commitment. One group member reported that they had been told that their research management system would require 400 hours of developer time, which would probably push it into the high cost bracket.

Data entry and quality checking – It is important not to underestimate the cost of data entry into the new system, both in terms of set up and in terms of ongoing cost. Even if data is bought in or cheap data entry effort is procured then there will still be an associated cost in quality checking that needs to be supported.

High cost tasks

This category was slightly more speculative than the others as many people in the group did not expect to receive high levels of funding.

Build systems – A number of people believed that this amount of money would enable their institution to build a system that could give their institution competitive advantage over rival institutions. However a note of caution was sounded here in that there may not be a competitive advantage in building your own system and building your own system may unnecessarily duplicate effort occurring in other institutions and in fact there may be advantages to collaborating with other institutions to build an open source system. Competitive advantage is more likely to be realised through the effective embedding of the system and the way it is used rather than building a unique system.

Best of breed products – given this amount of money a number of people suggested the best way it could be used was to buy best of breed products.

Staff – getting the staffing resource correct for any research management system was identified as a key success criterion and a concern for many of the group members. They were concerned with ensuring that the right staff were employed to implement a system and that those staff were then sustained by the institution where required.

An institutional scale data review - this was a scaled up version of the institutional requirements task mentioned under the moderate costs heading. Many group members felt that a really thorough review of institutional requirements, the data that would be managed by any system and the requirements for managing that data was a step they would ideally like to take before designing a system. Many felt that CERIF could help here.

Action plan
The final part of the session was spent discussing a possible action plan. The following headings were as far as we got. They are listed in chronological order:
1. Relationships – build relationships with all relevant stakeholders.

2. Feasibility – understand the system’s users, the high level requirements for the system and identify a rough cost.

3. Define institutional need and sustainability and get buy in from senior managers.

4. Produce a plan

5. Consult with users to gather requirements (this would need to start with a stakeholder analysis)

6. Analyse requirements gathered and report back to users with outline specification (it is probably desirable to make this process iterative and to continue the iterations throughout the building process).

7. Produce specification

8. Decide how to proceed and then move to tender or building process

9. Build it

10. Embed it (this process really started with the user consultation and needs to continue throughout the project). This will include training where appropriate.

11. Communication - this is likely to run throughout the project and have two processes:
a. Communicating over the tasks in the project with the relevant stakeholders
b. Wider dissemination and communication related to embedding the system through advocacy, training etc.

12. Sustainability handover – this needs to include:
a. Built in review process for the software (perhaps every 4 years)
b. Ongoing support including technical and managerial.

Wednesday, 12 May 2010

CRIS event blog write-ups

Richard Jones, Head of Repository Systems at Symplectic Ltd. and a member of the JISC Sonex working group has created two blog posts about our JISC/ARMA Repositories and CRIS event 'Learning How to Play Nicely' held at the Rose Bowl, Leeds Met University last Friday 7th May. Richard attended the event as an exhibitor of Symplectic.

Tuesday, 11 May 2010

Learning How to Play Nicely- Presentations online

Many thanks to our delegates, speakers and exhibitors for making last Friday's (7th May) JISC/ ARMA Repositories and CRIS event 'Learning how to play nicely' such a success.

For those unable to attend the event at the Rose Bowl, Leeds Metropolitan University, or for those who would like to recap, the presentations from the day along with some recorded sessions are now available from our website at

Further outputs from the day will be made available shortly- Watch this space!

Monday, 10 May 2010

Gregynog Repositories Stream - Programme now available

We have now announced the detailed programme for the forthcoming repositories stream at the 2010 Gregynog Colloquium. As you will see, there is plenty of variety on offer.

Tuesday 8th June 2010

15.30-17.00 WRN Business Meeting

Wednesday 9th June 2010

9.15 - 10.00 The power of mandates, Sue Hodges, University of Salford

10.00 - 10.30 Publications Management System at Swansea University - Alex Roberts, Swansea University

10.30 - 11.00 Research Management System at the University of Glamorgan - Leanne Beevers & Neil Williams, Glamorgan University

11.00 - 11.30 Tea

11.30 - 12.00 Developing a repository, caring, sharing and living the dream – Misha Jepson, Glyndwr University

12.00 - 12.30 Encouraging Author self-deposit at Cardiff - Tracey Andrews & Scott Hill, Cardiff University

12.30 - 13.00 Using statistics as an advocacy tool - Nicky Cashman, Aberystwyth University

13.00 - 14.00 Lunch

2.00 - 2.30 Repository Advocacy: The theory - WRN staff

2.30 - 3.30 Advocacy Café Society session

Three tables will be laid out, each with a facilitator and a topic to discuss. Participants move on to a new topic every 15 minutes, with a 15-minute slot at the end to feed back and present findings. Suggested topics:
A) What are the main obstacles to gathering content in your repository?
B) What are the main misconceptions your stakeholders have when it comes to your repository?
C) Put yourself in the shoes of an objector and outline the main arguments against having a repository.

3.30 - 4.00 Tea

4.00 - 5.00 Advocacy in Action: Workshop/exercise. Participants are asked to work in groups to produce some broad brush repository promotional materials.

As in previous years the WRN will be sponsoring places at the colloquium for up to 2 participants per partner institution. Further details have been sent out via the usual mailing list.

We look forward to seeing you there!

New article published

Earlier this year I was asked to write an article for Program: electronic library and information systems about the ongoing work of the Welsh Repository Network. I am pleased to say that this has now been published:

Knowles, J. (2010), Collaboration nation: the building of the Welsh Repository Network, Program: electronic library and information systems, 44(2), 98-108

Link to published article

Link to final author version in Cadair

Happy reading!

Wednesday, 21 April 2010

Preservation for Repository Practitioners

Aston Business School Birmingham, Thursday 27th May 2010.

In conjunction with the Repositories Support Project (RSP) and the Enhancing Repository Infrastructure in Scotland project (ERIS), we here at WRN are organising a free, one-day workshop at the Aston Business School Conference Centre, Birmingham, on Thursday 27th May, looking at preservation issues and repositories.

We have created a hands-on, practical programme with preservation tool presentations from the Digital Curation Centre (DCC) and the PLANETS project, as well as facilitated discussion sessions looking into preservation issues and your repository, and how to construct an action plan and preservation policy to use in your institution.

For a draft programme and booking please see the RSP event page.

Tuesday, 20 April 2010

Glyndŵr score a century!

GURO, Glyndŵr University's research repository, can now boast over 100 items!

GURO's content has increased by 1350% over the last 6 months bringing GURO's grand total of items to 116. Over half of these items are also full text.

Keep up the good work Glyndŵr!

If you would like to find out more about GURO please contact Misha Jepson, Repository Administrator at

Wednesday, 7 April 2010

Mendeley - Organize research, collaborate, and discover new knowledge

I recently attended Dev8D, an event funded by JISC with the aim of bringing together developers from higher education and other sectors in order to learn from one another and ultimately create better, smarter technology for research.

Part of the event included Expert Sessions where latest developments and solutions were presented. Amongst these was an overview of Mendeley - a free research management tool for desktop and web that has been described as a fusion of and iTunes for research papers!

Mendeley allows researchers to manage their libraries by automatically extracting metadata from research papers (in PDF format), which can then be used to create citations and bibliographies. Full-text searches are also supported and users can 'mark up' documents with comments on specific sections. But the real power of Mendeley lies in the social networking features and collective data gathered from users. Groups of like-minded researchers can be created for sharing and collaboratively tagging and annotating research papers. The service also provides statistics about research papers, authors and topics, allowing users to get recommendations and explore research trends.

The growth in users has been staggering since its release in 2008 - currently there are approximately 8000 institutions using Mendeley with 22,000 research groups collaborating and over 18,000,000 documents in people's libraries! It is quickly becoming one of the largest academic databases around and with funding recently secured from JISC and Europe, Mendeley propose to allow institutions to harvest documents deposited by their academics.

If you'd like to take a closer look, you can sign up and download Mendeley for free.

Wednesday, 24 March 2010

A blog about a blog

Valerie McCutcheon, Operations Manager in Research and Enterprise, University of Glasgow, has begun a new blog 'Research Outcomes: Managing Research Outputs and Impact.'

This blog aims to "tell the world what we are doing at the University of Glasgow re Research Outputs and specifically RCUK requirements." Valerie is updating the blog regularly and it provides an insight into developments in this area with RCUK.

JorumOpen: OA learning and teaching resource repository

Based on DSpace software, Jorum now offers an open access learning and teaching resource repository, JorumOpen. This new service allows access to resources licensed under Creative Commons, free to anyone, worldwide. JorumOpen complements Jorum’s original service JorumUK which, although free to use by members of the UK HE and FE communities, required an institutional subscription to access and deposit resources.

Having created a number of learning objects for the repository community, the WRN thought it would be apposite, to aid further distribution, to deposit these into JorumOpen. We had already deposited them into our institutional repository CADAIR. However, when searching for the best collection within JorumOpen to deposit them in, we made a discovery- one had already been deposited! Unfortunately, several elements of the metadata record were incorrect including the depositor passing themselves off as the publisher. In fairness to JorumOpen they were extremely cooperative in trying to amend the record, eventually taking it down so that I could create a new record for the item. Evidence of a take-down policy in action!

Item records for all three of our current learning objects are now available within JorumOpen. The registration process for deposit was simple and live deposit was instant. This is a good service to recommend to any keen individuals within your institution who wish to make any of their learning objects available to the wide world if your current IR collection policy does not include resources of this type.

Tuesday, 16 March 2010

Learning how to play nicely: Repositories and CRIS event - now open for booking

The WRN team are pleased to announce the following forthcoming event:

Learning how to play nicely: Repositories and CRIS (Current Research Information Systems)
Date: Friday 7th May, 2010
Venue: Rose Bowl, Leeds Metropolitan University
Cost: Free

For a draft programme, more information and a link to the booking form please visit our web site at

Friday, 26 February 2010

University OA Policy JISC Report

Report title: ‘Modelling scholarly communication options: costs and benefits for universities’

JISC have just released a commissioned report looking at how to build a business case for an Open Access policy within universities. The report, authored by Alma Swan, ‘is based on different types of university. It shows how universities might reduce costs, how they can calculate these savings and their greater contribution to society by following an Open Access route.’

'Neil Jacobs, programme manager at JISC said, “This is the first time that universities will have a method and practical examples from which to build a business case for Open Access and to calculate the cost to them of the scholarly communications process.” '

The report is available to download from the JISC repository at

Other useful resources from JISC for those institutions who are considering the options of OA publishing and an OA publishing policy are:

(Information from: O’Brien, R. (2010). News Release: How to build a business case for an Open Access policy. Message posted to JISC-ANNOUNCE electronic mailing list, archived at )

Wednesday, 24 February 2010

New WRN Learning Objects

The WRN are pleased to announce the launch of their first learning objects focussing on metadata and repositories, given a sneak preview at last Friday’s UKCoRR Meeting:

We are aiming to create a suite of learning objects looking at metadata use with different repository item types so look out for announcements of further resources available soon.

Already available via the WRN website is the first of the WRN learning objects:

‘Multimedia Deposits: Complications and Considerations with Intellectual Property Rights.’

We are looking for feedback on these learning objects to aid us with the design and content of future resources. An online survey has been created for the evaluation of each of the learning objects above, the link to which can be found within the last page of the object itself.

UKCoRR Meeting- 19th February 2010

Venue: University of Leicester

On Friday 19th I attended the UKCoRR Meeting hosted by the University of Leicester at their VERY impressive David Wilson Library (a clear picture of what can be done if you have £32 million available!). I had been invited by the UKCoRR Committee to speak about the work of the WRN and more specifically about the tools we have created (learning objects) and the services we are looking to offer (NLW e-theses harvesting; events). A copy of the presentation is available from CADAIR.

The meeting itself boasted a full day of presentations from members and also offered a great opportunity for networking with others in the repository community- especially those with hands-on, practical experience of repository issues.

The day opened with a Welcome address from Louise Jones, Director of Library Services who provided highlights of the achievements and future plans for the repository at Leicester:

  • mandates for both e-theses and all academic research outputs;
  • Research Information Management System bid in conjunction with the University’s Research Office;
  • hiring of a Bibliometrician to aid with REF/ research reporting;
  • plans for an Open Educational Resources repository- named OTA I think (?).

This was followed by presentations from Jenny Delasalle, UKCoRR Chair and Dr. Nicky Cashman, UKCoRR Secretary (and AU Repository Advisor). Nicky talked about her experiences as a Repository Advisor so far and highlighted the current ‘Opt-in’ repository deposit aspect of AU’s e-theses submission mandate and how this may conflict with EThOS digitisation requests in the future. This prompted a small discussion about how e-theses mandates had been handled in other institutions. At Leicester, permission has to be sought from past students before a thesis is made available to EThOS for digitisation. This is similar to the situation in Southampton where students have had to be contacted through the Alumni Office before their already digitised theses can be made available via the repository. Another institution uses the Freedom of Information Act to fall back on if a previously embargoed thesis is subsequently requested by EThOS for digitisation.

The next presentation came from Nick Sheppard and Wendy Luker at Leeds Met University about their recently completed Bibliosight project. The project looked at streamlining the population of repositories using metadata from WoK’s WSLite API. The code developed by the project is available as a JAR file. A query to WoK returns an XML page of results, which can then be transformed using XSLT, at which point extra fields can be added. These results can then be deposited into a repository via SWORD. The presenters highlighted some problems with the API, however:

  • only certain fields within the records are returned, abstracts are not included as WoK are not able to grant a license for their transfer;
  • it is not possible to distinguish between the publication type of the items returned;
  • a limit of 100 records returned per query. If more records are found a second query specifically requesting records 101-x/200 has to be made.
The attempt at a live demonstration on the day also highlighted that IP authentication may prove a problem when using the API. It was unclear whether this was from WoK’s end or something within the developed code. The Bibliosight project has focused on the mechanics of populating a repository with WoK records rather than the management issues surrounding it, so the copyright implications of data re-use have yet to be considered. Please see Nick’s blog post about the meeting, where you can also view his presentation.
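To illustrate the 100-records-per-query limit noted above, here is a minimal Python sketch of how a client might work out which follow-up queries to make. It is not the Bibliosight code (which is distributed as a JAR file) and it does not call the WoK API; the function name record_windows and its parameters are hypothetical, and the only assumption taken from the presentation is that each query can return at most 100 records starting from a requested record number.

```python
def record_windows(total_found, page_size=100):
    """Yield (first_record, count) pairs that together cover every record.

    total_found: number of records the search reported (hypothetical input).
    page_size:   maximum records per query; 100 is the limit described above.
    Record numbers start at 1, so the second window begins at record 101.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    first = 1
    while first <= total_found:
        # The last window may be shorter than page_size.
        count = min(page_size, total_found - first + 1)
        yield first, count
        first += count


if __name__ == "__main__":
    # A search that found 250 records needs three queries.
    for first, count in record_windows(250):
        print(f"request {count} records starting at record {first}")
```

Each pair would then be passed to whatever query call the client uses, so a search returning 250 records is fetched as records 1-100, 101-200 and 201-250.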

Gareth Johnson, our host at the University of Leicester, gave a very entertaining presentation about his experiences as University of Leicester Repository Manager; a copy of his presentation is available from SlideShare. An interesting anecdote he raised related to commercial bodies’ use of an institution’s repository and its content for vetting researchers. If a commercial body is looking to approach an academic to collaborate on a research project, the availability of that academic’s full text gives it an insight into the quality of research being produced by that individual. A useful element to include in any repository advocacy! Gareth has also created a useful commentary on the meeting as it happened, available from the UoL Library Blog.

Another useful presentation came from Jane Smith and Peter Millington from Nottingham, looking at the additions that have been made to SHERPA RoMEO and its cross-over with SHERPA JULIET. Jane highlighted that although one of the new additions to RoMEO was an ‘Updated on’ field in records, they still did not have the capability to display all past versions of a publisher’s open access policy. They do, however, store paper copies of each incarnation of a policy they are aware of and copies can be made available on request to

Hopefully all the presentations from the day will be available via the UKCoRR website.

Please see the UKCoRR membership pages for info on how to join.

Tuesday, 16 February 2010

Heading in the right direction - new statistics now in!

I have finished processing the latest batch of statistics from our project partners and I am happy to report we have seen an improvement in the total growth rates for our repositories for the three months October through to December 2009.

Growth rate Jul - Sep 2009 = 12.47%
Growth rate Oct - Dec 2009 = 13.84%
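For anyone wanting to produce a comparable figure for their own repository, the sketch below shows one way to do it in Python. It assumes the growth rate is the percentage change in the total item count between the start and the end of the period; the post does not spell out the exact formula, so treat this as an assumption. The function name growth_rate is hypothetical and the counts in the example are invented for illustration, not WRN data.

```python
def growth_rate(start_total, end_total):
    """Return the percentage growth in total items over a period.

    Assumed formula: (end - start) / start * 100.
    """
    if start_total <= 0:
        raise ValueError("start_total must be positive")
    return (end_total - start_total) / start_total * 100


if __name__ == "__main__":
    # Invented counts: 1000 items at the start of the quarter, 1125 at the end.
    print(f"Growth rate = {growth_rate(1000, 1125):.2f}%")
```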

The figure we use to calculate these rates is the total number of items appearing in each of the Welsh repositories. Detailed figures are as follows:

Keep up the good work everyone!

Wednesday, 27 January 2010

Digital Preservation Roadshow, Aberystwyth

Last Friday Jackie and I presented at the CyMAL / Society of Archivists Digital Preservation Roadshow held at the National Library of Wales (NLW). The event was aimed at all practitioners involved with the management of digital records, with delegates coming from different organisations within the public sector including Welsh regional archive services, universities and the NLW.

Our presentation looked jointly at the approaches taken to digital preservation within the repository community and how preservation is being put into action within our e-theses harvesting workpackage, being conducted with the NLW. The presentation that followed ours was delivered by our NLW partner, Glen Robson, who discussed what was going to happen to the harvested e-theses records once they had reached the NLW repository or DAMS (digital asset management system) as they call it.

Other interesting presentations during the day included:
  • A forecast of the necessary skills needed by future practitioners to manage and preserve digital records, given by Kirsten Ferguson-Boucher, Records Management Lecturer at the Department of Information Studies, Aberystwyth University.

  • An entertaining and engaging presentation on METS and other standards including PREMIS by Lyn Lewis Dafis, Head of Metadata and Digitisation Unit, NLW. It actually made these complicated metadata standards understandable for non-techies like me!

  • An overview of the digital preservation policy planning at Cardiff University by Sarah Phillips, Records Manager.

  • A short description from Ioan Isaac-Richards of a useful tool for assessing file format suitability for preservation, developed by the National Library of the Netherlands and used by the NLW.

The Society of Archivists hopes to have the presentations for the day available via the Digital Preservation Roadshows 2009-10 webpage soon.

Tuesday, 26 January 2010

Now planning - Repositories and CRIS event

The WRN team are currently planning a repositories and CRIS event. This one-day event will be one of the programme meetings for the JISC Inf11 programme and will explore the close relationship between CRIS (Current Research Information Systems) and institutional repositories. We hope to be able to co-host this meeting with representatives from ARMA, the professional association for research managers and administrators in the UK.

The programme and details are at an early stage and we are inviting feedback from our community to help us shape the event. We currently have an online discussion page available at

Please join in and let us know what you think!