Kehan Harman – https://archive.gaiaresources.com.au Environmental Technology Consultants Thu, 29 Feb 2024 03:47:38 +0000 en-AU hourly 1 https://wordpress.org/?v=4.9.1 Implementing the Drupal aGov distribution at the Forest Products Commission https://archive.gaiaresources.com.au/forest-products-commission-website-new-implementation-drupal-agov-distribution/ Thu, 05 May 2016 05:13:19 +0000 https://archive.gaiaresources.com.au/?p=3525 Last week, the Forest Products Commission (FPC) launched their new web site, which we had a hand in developing. Our project was to assist the FPC in updating their website into a modern content management system. We proposed that they use the aGov Drupal distribution as it implements many of the state and national requirements in terms of... Continue reading →

The post Implementing the Drupal aGov distribution at the Forest Products Commission appeared first on Gaia Resources.


Last week, the Forest Products Commission (FPC) launched their new website, which we had a hand in developing.

Our project was to assist the FPC in migrating their website to a modern content management system. We proposed that they use the aGov Drupal distribution, as it implements many of the state and national requirements for security, metadata and accessibility, meaning that we were able to focus on custom content, custom webforms, theming and supporting their team around the launch.

We used some interesting technologies, such as Behat, a Behaviour Driven Development testing framework, to write user interface tests that ensure changes made through the Drupal UI don’t break desired functionality.

Custom content types

We set up a number of custom Drupal content types and associated views to enable our client to manage their content appropriately. These include:

Custom webforms

We used the powerful Webform module to build several forms to capture feedback from the public. Using a flexible module like Webform makes it possible to build fairly complex forms with dependencies between fields; a good example of this is the feedback and complaints form.

Theming

The theme for the new site uses Responsive Web Design, enabling content to be viewed clearly on both mobile devices and desktop computers. The theme is a subtheme of the excellent Zen Drupal theme, which aims to produce sensible, clean, semantically meaningful markup.

Screenshot of the original FPC website.

 

The website went live last week, and we will be undertaking a few final tasks on the FPC intranet site to help with their internal data management. It’s been a pleasure to work with the team to get to launch – if you have any queries about this project, feel free to leave a comment below, or start a conversation with us via Facebook, Twitter or LinkedIn.

Kehan

Adding Australian National Species List names to CollectiveAccess https://archive.gaiaresources.com.au/adding-australian-national-species-list-names-collectiveaccess/ Wed, 04 Nov 2015 00:43:54 +0000 http://archive.gaiaresources.com.au/?p=3108 We recently added a plugin to CollectiveAccess to reference names from the Australian National Species Lists.

The post Adding Australian National Species List names to CollectiveAccess appeared first on Gaia Resources.

Recently, during our pilot project helping CSIRO evaluate CollectiveAccess as a candidate for the National Collections, we needed to show that we could use an external nomenclator within CollectiveAccess. Luckily, CA comes with a generic ‘InformationService’ attribute (field) type, which lets you reference an external web service and index and display content from it within your collection management system. Up until now the services available to reference were:

While uBio has good global coverage, it lacks a lot of the Australian names, and unfortunately the uBio service is currently down. There has already been significant work compiling taxonomic names for Australia, embodied by:

Under the auspices of the Atlas of Living Australia (ALA), there has been work to integrate the APC and the AFD into a single National Species List. I initially tried integrating with the web services referenced there, but I failed to implement a fast enough autocomplete using them (and who wants to sit around waiting for an autocomplete?). Then, in my travels around the interwebs, I discovered that the Australian National Species Lists (NSL) are currently evolving into a fully RESTful framework with a good, well-documented API. Currently only vascular plant names are included in this framework, but I have it on good authority that fungi, algae, mosses, lichens and the AFD are on the way too (see the roadmap here).

In order to add a new InformationService attribute type to CA I needed to implement the IWLPlugInformationService interface, which proved relatively straightforward. I implemented the following methods in my new class to look up the data from the API:

  • lookup() – provides the autocomplete
  • getDataForSearchIndexing() – adds additional data to the search index
  • getExtendedInformation() – Adds a view of the referenced data – this hooks into the
  • getExtraInfo() – adds additional data to store serialized with the attribute (field) value
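A minimal sketch of what the lookup() method has to do – query an autocomplete endpoint and reduce the JSON response to labels the UI can show. This is illustrative Python rather than the actual PHP plugin, and the endpoint and field names here are hypothetical, not the real NSL API:

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint -- the real NSL API paths and fields differ;
# this just illustrates the shape of a lookup() implementation.
NSL_BASE = "https://example.org/nsl/api/name/check"

def build_lookup_url(term, limit=10):
    """Build an autocomplete query URL for a partial scientific name."""
    return NSL_BASE + "?" + urlencode({"q": term, "max": limit})

def parse_matches(payload):
    """Reduce a JSON response to (label, url) pairs for the autocomplete."""
    data = json.loads(payload)
    return [(rec["fullName"], rec["url"]) for rec in data.get("names", [])]

sample = '{"names": [{"fullName": "Eucalyptus marginata Sm.", "url": "https://example.org/nsl/name/123"}]}'
print(parse_matches(sample))
```

The key to a responsive autocomplete is keeping this round trip small: send only the partial name, get back only the display label and a resolvable URL.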

Configuring the NSL (and APNI) attribute type

Configuring a field in CollectiveAccess to reference the NSL. You can choose which additional fields you want to add to the search index and also what format the information is presented in.

Autocomplete for scientific names

The autocomplete for scientific names. The NSL web service is responsive and provides just enough information so this autocomplete doesn’t leave the user frustrated.

 

APC Format embedded information

Name record viewed in APC format

 

APNI Format embedded information

Because other opinions are also available:

Name record viewed in APNI format

We have already contacted the developers of this new API with some suggestions for possible enhancements, especially relating to the embedded data format that it currently outputs.

Doing this development is only one part of the cycle – CollectiveAccess is an Open Source project, so now that we have implemented this locally we can give a little back. Thanks to Seth for accepting our pull request adding this new feature, which will become available in version 1.6 of CollectiveAccess.

Now that we’ve seen how easy it is to reference external authoritative data sources within CollectiveAccess, watch this space for further development. We’re also looking forward to having more data in the NSL and enhanced functionality in its API.

Kehan

Leave me a comment below, or start a conversation with us on Facebook, Twitter or LinkedIn.

CollectiveAccess – powerful, flexible collection management https://archive.gaiaresources.com.au/collectiveaccess-powerful-flexible-collection-management/ Tue, 21 Jan 2014 07:08:31 +0000 http://archive.gaiaresources.com.au/wordpress/?p=2109 We’ve previously posted about Open Source Collection Management and our work with the Western Australian Museum (WAM) including the Waminals Project. This is a quick post about our experiences with CollectiveAccess, the system chosen by the WAM as their institutional data repository, which we’ve been implementing within the Museum. CollectiveAccess (CA) is an open source... Continue reading →

The post CollectiveAccess – powerful, flexible collection management appeared first on Gaia Resources.


We’ve previously posted about Open Source Collection Management and our work with the Western Australian Museum (WAM) including the Waminals Project. This is a quick post about our experiences with CollectiveAccess, the system chosen by the WAM as their institutional data repository, which we’ve been implementing within the Museum.

CollectiveAccess (CA) is an open source data management platform that has come out of the Museum informatics community. It was known as OpenCollection in its early days but really found its footing when it became known as CollectiveAccess.

Flexible Software

Like a number of the other tools that we work with and develop, CollectiveAccess gives the user or system administrator the ability to define their own data schema and data entry forms. It is particularly suited to the Western Australian Museum because of its support for revision tracking, media management and built-in collection management modules, including conservation and loan tracking. While the system is extremely configurable, it also comes with sensible defaults and, through installation profiles, supports a number of metadata standards, including Dublin Core, Darwin Core and ISAD(G), to name a few. In addition, since we started work on this project CA has added a number of great features, including data importers and exporters, bulk (spreadsheet-style) and batch (set the same value on multiple records) editing, and a RESTful web services API.

CollectiveAccess itself has a number of loosely joined components. The main basis of it is Providence, the data management and entry backend. Providence has a number of optional plugins, and it is possible to enhance its features for individual organisations through the development and customisation of plugins. On top of that, it is possible to install and theme a public frontend to CA using Pawtucket. There are also iOS and map-based viewers of CA data, along with modules for other software such as Drupal to incorporate the data into institutional websites.

Community Software

The object details screen in CollectiveAccess

As can be seen from the number and diversity of institutions reported as running CA, it has a very broad user base and is supported by a number of institutions, private development agencies and individuals. This has meant that it has managed to capture a lot of the requirements of institutional collections and archives. Having the backing of a number of commercial agencies has meant that modern software development practices and techniques are now used. As we have noted elsewhere, open source does not mean that companies cannot both contribute to and benefit from shared projects. There is a great community forum that is open to both novices and more advanced users, and this active community is backed by a well-managed wiki that contains all of the documentation relating to setting up, maintaining and developing CA instances. The lead developer, Seth Kaufman, has an in-depth knowledge of the system and is ready and able to support users and developers with their queries, but he is certainly not alone.

Powerful Software

On top of the ability to define custom forms and fields, CollectiveAccess also lets the user define custom relationships between the native and custom object types, and it is possible to add fields to these relationships. This has allowed us to collect information such as the details of the person or people who identified a particular specimen on a particular date. If these relationships were just plain text fields, you wouldn’t be able to browse from a specimen to the person who identified it, then on to the list of specimens that person has identified or collected, and from there to the collecting localities they have visited.
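The difference is easiest to see in miniature. Here is a tiny hypothetical sketch in Python (made-up names and registration numbers, not the CA data model) of why typed relationships can be traversed where plain text fields cannot:

```python
# A typed relationship is effectively an index in both directions:
# person -> specimens they identified, and specimen -> identifier.
identified = {
    "Maxine Curator": ["WAM-T12345", "WAM-T12346"],
}

# Invert the relationship so we can start from a specimen.
specimen_identifier = {reg: person
                       for person, regs in identified.items()
                       for reg in regs}

start = "WAM-T12345"
person = specimen_identifier[start]   # specimen -> the person who identified it
others = identified[person]           # person -> all specimens they identified
print(person, others)
```

With a plain text "identified by" field, the second hop (person back to their other specimens) would require fragile string matching rather than a simple lookup.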

The data browser in CollectiveAccess is paired with a powerful full-text search index that allows the user to facet their data in any way they like. On top of that, it is possible to easily export the results of searches and present them in any number of formats, including PDF, CSV, thumbnails and labels ready for printing.

Give Back

One of the most gratifying things about working on Open Source Software is that you are free to modify the software to your needs and, more importantly, you can (and indeed are sometimes required to) contribute those changes back to the software development community. We have experienced this on numerous occasions: we started out finding bugs in the software and reporting them to the CollectiveAccess issue tracker, and fixes for those bugs would land within a few hours in the CollectiveAccess GitHub repository. As we became more familiar with the software, we have been able to contribute features and bug fixes via GitHub pull requests. If you are planning to contribute code to any open source project, it is generally a good idea to record the reasons for your changes, and sensible descriptions of them, on the project’s issue tracker, and CA is no exception: they ask that you file feature requests and bug reports in their Jira instance, which we have been doing on a regular basis.

And in a double twist of the usual status quo with open source software, we have recently received a nice tote bag from the core CA developers for our contributions to the project – Thanks 😉

Links to Collective Access material

In summary, we have found that CollectiveAccess has given us a lot of the benefits of FOSS (Free and Open Source Software) and enabled us to move forward with our partners with minimum pain and maximum gain. Stay tuned to our blog and we will keep you updated as this project with the Western Australian Museum progresses.

ESA 2012 – AKA Ecology, Coffee, Melbourne https://archive.gaiaresources.com.au/esa-2012-aka-ecology-coffee-melbourne/ https://archive.gaiaresources.com.au/esa-2012-aka-ecology-coffee-melbourne/#comments Tue, 18 Dec 2012 02:27:50 +0000 http://archive.gaiaresources.com.au/wordpress/?p=1926 This year we again sponsored the ESA (Ecological Society of Australia) Annual Conference and I got to go along to support the mobile apps that we wrote as part of our sponsorship. Like the last conference that Piers and I attended a couple of years ago this was again an awe inspiring conference, although the community seems to have... Continue reading →

The post ESA 2012 – AKA Ecology, Coffee, Melbourne appeared first on Gaia Resources.

This year we again sponsored the ESA (Ecological Society of Australia) Annual Conference and I got to go along to support the mobile apps that we wrote as part of our sponsorship. Like the last conference that Piers and I attended a couple of years ago this was again an awe inspiring conference, although the community seems to have taken several technological leaps forward. Two years ago Piers and I were pretty much the only people tweeting at the conference but last week there were over 540 tweets from at least 95 twitter users using the hashtag #ESAus2012. One of the presenters even tweeted his talk at the same time as he was presenting it, although the ephemeral nature of Twitter seems to have swallowed that up. I have archived the tweets from the conference if anybody is interested.

Ian Lunt, the opening keynote presenter for the conference, seems to have wrapped it up pretty well on his blog, but I have several personal take-home messages that I developed throughout the week. I took copious notes, of varying degrees of quality, in all of the talks that I went to, experimenting with Evernote on my Android phone before finally reverting to the reliable company Eee PC – there’s something reassuring about having a keyboard that you don’t have to look at while you’re typing. Ian Lunt’s keynote seemed to set the theme for the whole conference: we have to always be aware that the work we’re doing needs a full context.

The second keynote, from Emma Johnston on stress ecology, was a bit of an eye opener in terms of the impact that humanity is having on the planet – there are few if any locations on Earth where there is no pressure from anthropogenic chemicals! Damaging levels of herbicides floating around the Great Barrier Reef – and we wonder why coral reefs are disappearing? Refreshingly, though, her lab seems to be getting in on ecosystem-level ‘omics, which really is a leap forward for ecology. In a past life I did some work looking at something called the ‘Interactome’, which is essentially a network model of how organisms function from the ground up, and I remember thinking: I wish we could do this for whole ecological communities – pipe dreams do pay off in the end, I guess.

Anyway I’m not going to recap the full conference, as there are abstracts online on the site that we built to support the mobile app and host the data. Plus I have a list of all of the talks I attended as a byproduct of selecting them in the session planner for our mobile app.

The poster sessions were both very successful due to the ingenious idea of giving the presenters an almost godlike power: they could give out free drink tokens to people who asked sufficiently intelligent questions. Some other nice features about the conference were that the catering was all vegetarian and there was barista coffee sponsored by the Centre for Integrative Ecology.

Conference Mobile App – how did it hold up?

One of the objectives of me being at the conference was to act as ‘on the ground’ support should the mobile app not work for somebody. To my amazement, pretty much all of the feedback that I got from everybody I accosted was that the app was working well. It was nice to hear the conference logistics staff saying that a number of people had turned down the 1 kg tome that was the conference proceedings as they ‘already have the app for that’ – maybe this means that next year far fewer pages will be printed. We did find a few bugs, and those have all been documented so that when we do this again the app will be fixed (anybody on an iPhone notice that you couldn’t scroll to the bottom of the abstract without it bouncing back?). One nice feature of having a Drupal backend for the mobile app was that I could change some features of the app during the conference.


Towards the end of the week there was even a tag-team talk by two Melburnian professors who seemed to be able to agree on only one thing – they both hate Essendon. It was actually a good fun rant on Popperian hypotheses vs Bayesian posterior probability, and even included the great quote: “ecologists use statistics like a drunken man uses lamp posts – for support rather than illumination”. My final great revelation of the week was the fact that “Sugargliders are Jerks”, according to Dejan Sojanovic, who has been studying the demise of the swift parrot (Lathamus discolor). He won last year’s Jill Landsberg scholarship, which is given out by the ESA, and he came to present the results of his research. These cute parrots breed in tree hollows in Tasmania, and with the help of remote cameras he managed to catch the gliders in the act of eating helpless chicks. Apparently sugar gliders were introduced into Tasmania from mainland Australia just under a century ago.

So, all in all, a great academic conference that benefitted from professional organisation and conference hosting. I’d just like to finish off with a request to our readers and the people who used the mobile app at the conference:

 


Do you have any suggestions as to how we could further develop the conference application and associated website? Twitter integration? Voting on talks? Uploads of slides? Lolcats?

Contact me on email or start a conversation on Twitter.

Waminals – Species Databases, Mungeing, Maps and Drupal https://archive.gaiaresources.com.au/waminals-species-databases-mungeing-maps-and-drupal/ Wed, 03 Oct 2012 02:35:22 +0000 http://archive.gaiaresources.com.au/wordpress/?p=1934 Last night I gave a presentation at the Drupal WA Meetup on a recent project we’ve collaborated on with the Western Australian Museum. The aim of the project was to make up to date species identification information available to field workers. I’d like to give a little background about the project and some details as... Continue reading →

The post Waminals – Species Databases, Mungeing, Maps and Drupal appeared first on Gaia Resources.

Last night I gave a presentation at the Drupal WA Meetup on a recent project we’ve collaborated on with the Western Australian Museum. The aim of the project was to make up-to-date species identification information available to field workers. I’d like to give a little background about the project and some details as to how we went about integrating the Museum’s collections databases with a Drupal frontend to give environmental consultants a dynamic identification aid.

Collections and Databases

 

Over the years we have worked with the Western Australian Museum (WAM) on a number of projects, including database and GIS support. They have been a great client to work with and we have learned a lot from each other. Over the past few months I’ve been spending a couple of days a week as an embedded developer at the Museum, working on a project to ‘Integrate the collections databases with the Museum’s online catalogues’. After a few days of discussions with stakeholders, and after getting myself set up with the Museum’s development framework of Drupal on a fairly beefy LAMP server, I set out on the project that became affectionately known as ‘Waminals’. Each collection at the Museum has its collections data housed in a separate database that has been tailored to that particular collection’s needs. Here are a few details about the previous workflow and what we were trying to attain.

 

Previous Workflow

 

The previous workflow was:

 

  • Consultant lodges specimens, collecting data and identifications with the WAM
  • WAM curator then receives specimens
  • Checks identifications
  • Checks taxonomy
    • Assigns ‘putative name’ while taxa are still being described
  • Accessions Specimens
    • Adds the specimen to the museum’s catalogue (database) and stores it in the appropriate location and medium.
  • Gives consultant back a list of accession numbers, collecting information and the correct identification
    • Often these names are not fully published names but rather field names that will eventually get replaced by validly published scientific names
    • The consultant is then ignorant of any subsequent changes to the data

 

Waminals Workflow

 

The Waminals project aimed to enhance this workflow by creating factsheets for taxa that are commonly thought of as new:

 

  • Created by the ecological consultants on museum infrastructure
  • Checked by museum curators and approved for ‘publication’
  • Consultant will add
    • descriptive couplets (attributes) of character name and character state
    • Multiple images of specimens which have been registered in the WAM collections
    • Registration numbers for the specimens in the photographs
  • Name details that the consultant has are entered at the same time
  • Factsheet then checked by WAM staff, and when approved it will be published.
  • The System will then periodically
    • Take the registration numbers from the factsheet
    • Query collection databases for names associated with those registration keys
    • Update the name details on the factsheet to match the current name used in the CMIS.
      • These fields should then be locked from further update by the users – updates now come from the database.
      • ‘Putative names’ will eventually be replaced by the correct name when published.
    • Retrieve other specimens that are associated with that name
      • Pull in coordinates & locality information for these specimens
      • Display them in an online map
  • The system should only expose data to authorised users due to the commercial and intellectual property sensitivities of the data.
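The periodic synchronisation step above can be sketched as follows – an illustrative Python fragment with a stubbed collection database and hypothetical registration numbers and names, not the production code:

```python
# Stub of a collection database keyed by registration number; in the real
# system this data comes from the museum's databases via Pentaho.
COLLECTION_DB = {
    "WAM-T12345": {"name": "Draculoides 'sp. B21'", "lat": -22.3, "lon": 118.6},
    "WAM-T12346": {"name": "Draculoides 'sp. B21'", "lat": -22.4, "lon": 118.7},
}

def sync_factsheet(factsheet):
    """Refresh a factsheet's name from its registered specimens, then gather
    coordinates of all specimens sharing that (possibly putative) name."""
    regs = factsheet["registration_numbers"]
    names = {COLLECTION_DB[r]["name"] for r in regs if r in COLLECTION_DB}
    if len(names) == 1:
        # Name now comes from the database; users can no longer edit it.
        factsheet["current_name"] = names.pop()
    factsheet["occurrences"] = [
        (r, rec["lat"], rec["lon"])
        for r, rec in COLLECTION_DB.items()
        if rec["name"] == factsheet.get("current_name")
    ]
    return factsheet

sheet = {"registration_numbers": ["WAM-T12345"], "current_name": None}
print(sync_factsheet(sheet)["current_name"])
```

When the putative name is eventually replaced by a validly published one in the collection database, the next sync pass propagates it to the factsheet automatically.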

 

Data Transformations

 

Due to the heterogeneous database backends at the museum, we could not simply expose the collections data to Drupal without an intermediate transformation step. Luckily there are plenty of people out there who have been in the business of transformations a long time and have created great tools such as Google Refine (see my previous blog post on using that) and Pentaho Data Integration. The Museum is currently using Pentaho to mobilise some of their collections data to the Atlas of Living Australia, and it provides the facility to run transformations server-side via a shell script, so the decision to use it was an easy one to make.

 

Data exchange format

 

So now that we had a tool to do the data transformations for us, we needed to settle on a standard exchange format for the specimen data. Again we didn’t have to look much further than TDWG and their Darwin Core standard, as it now supports the lowest common denominator of data exchange formats – CSV.

 

<rant>CSV stands for ‘Character Separated Values’ not ‘Comma Separated Values’.</rant>
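Rant aside, the practical point is that a Darwin Core text file is just delimited rows whose column headers are Darwin Core terms, and the separator is whatever you declare it to be. A small Python illustration (the specimen row is made up):

```python
import csv
import io

# Darwin Core terms as column headers, pipe rather than comma as separator.
raw = ("occurrenceID|scientificName|decimalLatitude|decimalLongitude\n"
       "WAM-T12345|Draculoides bramstokeri|-21.98|117.24\n")

# csv.DictReader handles any single-character delimiter, not just commas.
reader = csv.DictReader(io.StringIO(raw), delimiter="|")
rows = list(reader)
print(rows[0]["scientificName"])
```

The same reader, with a different `delimiter` argument, covers tab-separated exports from the collection databases as well.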

 

Data Loading

 

There are a number of ways to expose data to Drupal, from quick and dirty hacks to elaborate modules with complex data loading regimes; we evaluated the Feeds and Migrate modules. Feeds has a nice UI and lets you map your source data to Drupal fields through point-and-click functionality, but it missed a few features that we were looking for. Migrate, on the other hand, has a fairly simple UI, and the mapping of data to Drupal entities is done by extending the base Migration class in PHP code. This meant that we could define fairly complex transformations during data loading. Additionally, the Feeds module appears to have been suffering from a little bit rot lately, with the bug count creeping ever higher.
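To make the contrast concrete, here is the Migrate pattern sketched in Python rather than PHP – declared field mappings on a subclass, plus a hook where arbitrary code can transform values during loading. Class and field names are illustrative only, not the actual module API:

```python
class Migration:
    """Base class: subclasses declare mappings and optional transformations."""
    field_map = {}  # destination field -> source column

    def transform(self, row):
        return row  # subclasses override this for complex transformations

    def migrate_row(self, source_row):
        row = {dest: source_row[src] for dest, src in self.field_map.items()}
        return self.transform(row)

class OccurrenceMigration(Migration):
    field_map = {"title": "CatalogNumber", "field_lat": "Latitude"}

    def transform(self, row):
        row["field_lat"] = float(row["field_lat"])  # cast during loading
        return row

node = OccurrenceMigration().migrate_row(
    {"CatalogNumber": "T12345", "Latitude": "-21.98"})
print(node)
```

Because the mapping lives in code rather than a point-and-click UI, transformations like type casts, lookups and string munging are ordinary methods on the class.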

 

Site Structure

 

Wherever possible we used Drupal modules that expose their configuration to the excellent Features module, which allows you to export and version control your configuration, making deployment between staging and production servers much less of a headache than essentially ‘doing the same thing again on a different server’. Additionally, all the code developed was pushed to the Museum’s Git repository to make collaboration easier.

 

Factsheets

 

The Museum’s Catalogues site is where all the data had to end up. The guys at the Museum had already set up a content type to capture the factsheet data, using Drupal’s great Field UI. Species descriptions often take a similar form, but depending on the group you are working with, the names of characters and states can vary. The WAM developers came up with a neat solution to this by using hook_field_info() to define a field type that gives you two fields per value: one for the character name and another for the character state. The character name autocompletes from pre-existing values, helping to normalise the number of characters entered. The nice thing about using Drupal’s field system is that these fields are made available to the Views module, so you can start building complex queries on them.
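The normalising effect of that autocomplete can be sketched like this – an illustrative Python fragment (not the actual PHP field code) showing name/state couplets being matched case-insensitively against previously entered character names:

```python
# Character names already in use across factsheets (hypothetical examples).
existing = ["Carapace colour", "Leg IV length"]

def normalise_character(name):
    """Reuse an existing character name when one matches case-insensitively,
    mimicking what the autocomplete nudges data-entry users towards."""
    for known in existing:
        if known.lower() == name.lower():
            return known
    existing.append(name)  # genuinely new character: remember it
    return name

# Each couplet is (character name, character state), as in the custom field.
couplets = [(normalise_character(n), state) for n, state in [
    ("carapace colour", "pale yellow"),
    ("Eye spots", "absent"),
]]
print(couplets)
```

Consistent character names are what make the Views-based queries over these fields useful – "carapace colour" and "Carapace colour" must end up as one character, not two.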

 


Consultants from collaborating ecological consultancies then filled in the species descriptions and uploaded images throughout the process of the project.

 

Taxonomy

 

The names from the specimens are stored in a Drupal taxonomy containing additional fields inspired by the Darwin Core Taxon-level terms.

 


 

Occurrences

 

The specimens were loaded into a Drupal content type modelled around the Occurrence, Event and Location level terms, with references to the Taxon vocabulary within the site.

 

The fields used in the occurrence content type

 

Geospatial Data

 

Geospatial data in Drupal has had a fairly complex history, and the number of spatial modules is quite overwhelming. Luckily the Geofield module has brought some consistency, embracing the above-mentioned Field UI. This means that you can add any number of spatial fields to a Drupal entity, each with its own settings and input widgets; you could, for example, have a lat/lon pair of fields for a specific point and a separate draw-on web map to define a polygon that you’re interested in (think of a capital city within a country, or the location of a specimen within a collecting area). The Geofield module uses the geoPHP library (https://github.com/phayes/geoPHP, http://drupal.org/project/geophp), which in turn uses the PHP bindings for GEOS (Geometry Engine – Open Source) if you have them enabled. This means you can perform spatial transformations from within Drupal.
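As an illustration of the kind of operation GEOS provides natively, here is the classic ray-casting point-in-polygon test in Python. The polygon is a made-up collecting area; real code would call the library rather than hand-roll this:

```python
def point_in_polygon(x, y, polygon):
    """polygon: list of (x, y) vertices; returns True if (x, y) is inside.
    Casts a ray to the right of the point and counts edge crossings."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y-coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # odd number of crossings => inside
    return inside

# A rough, hypothetical collecting-area polygon (lon, lat) in the Pilbara.
area = [(116.0, -23.0), (119.0, -23.0), (119.0, -21.0), (116.0, -21.0)]
print(point_in_polygon(117.24, -21.98, area))
```

Being able to run predicates like this server-side means questions such as "which specimens fall within this consultant's survey area?" can be answered inside the CMS.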

 

Online Maps

 

Again there were a few options with respect to how we were going to present the data from the Waminals project, but OpenLayers is the project that we have the most experience with; its Drupal implementation is fully featured, and a good number of layer types are available for Drupal, either built in or through modules such as the MapBox module.

 

As the data we were presenting was not going to be freely available to all registered users on the site, due to commercial and intellectual property sensitivity, we could not use Google Maps. There are, however, great datasets with decent imagery for the scale of maps we wanted to present, including the NASA Blue Marble imagery.

 

Access Control

 

As mentioned above, by default the data must not be exposed to unregistered users. This was achieved via the Content Access module, which means that if the system is extended and made publicly available, it is simply a matter of specifying the access level of individual occurrences and factsheets, and the system will expose them as appropriate.

 

Sample Factsheet

 

A sample factsheet for one of the Schizomids

 

Factsheet Browsing

 

We needed a way of browsing between factsheets and finding taxa with similar characteristics, and the Views module was the starting point for that. A tabular view looked like what we wanted, but we wanted the fields down the left-hand side and the individual nodes as a side-scrolling table. Again we did a bit of research, and the Views Hacks module provides exactly that functionality in the form of a views_flipped_table output style. Some jQuery and CSS magic from the Museum’s designers and we had a nice tablet-friendly factsheet browser thingamajig. Plus some careful URL crafting meant that it was possible to pass the view’s input filters in as GET query parameters, so drilling down to factsheets with attributes matching the specimen being observed is quite fast and intuitive.
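The flipped-table idea itself is just a transposition: field names run down the left-hand side and each node becomes a column. A tiny Python sketch with made-up data:

```python
# Field names and two factsheet "nodes" (hypothetical example data).
fields = ["Taxon", "Carapace colour", "Eye spots"]
nodes = [
    ["Draculoides sp. B21", "pale yellow", "absent"],
    ["Draculoides sp. B07", "orange-brown", "present"],
]

# Standard table: one row per node. Flipped table: one row per field,
# with each node contributing a column.
flipped = [[field] + [node[i] for node in nodes]
           for i, field in enumerate(fields)]
for row in flipped:
    print(row)
```

Side-scrolling through columns is a much more natural way to compare two candidate taxa character by character than scrolling up and down two separate pages.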

 


These factsheets do not replace classical dichotomous keys or interactive keys but the data captured through the collaborative infrastructure of a Drupal website makes a great starting point for creating either of these identification aids should the need arise.

 

Conclusion

 

It was great to be working on site with the client as queries could be answered quickly and changes could be tested as soon as they were implemented. Drupal is a great web framework and has come on leaps and bounds since I first started using it six years ago. Additionally, and more interestingly for us as a company with a fairly strong interest in spatial technology, it is possible to create web maps using Drupal without too many hassles so watch this space.

 

Presentation

I have embedded the presentation that I gave below, if you would like to take a look:

If you have any responses, contact me via email or Twitter, or leave a message below.

 

The post Waminals – Species Databases, Mungeing, Maps and Drupal appeared first on Gaia Resources.

]]>
Importing species profiles into the BDRS with the help of Google Refine https://archive.gaiaresources.com.au/importing-species-profiles-into-the-bdrs-with-the-help-of-google-refine/ Mon, 20 Feb 2012 06:31:44 +0000 http://archive.gaiaresources.com.au/wordpress/?p=1837 For a number of our projects we have to either do data cleaning or transform data from a number of different sources / formats into subtly different formats. While it is possible to write custom (Python, Perl, Java, AWK) scripts to do this sometimes it’s just a little too tedious getting to the bits we’re... Continue reading →

The post Importing species profiles into the BDRS with the help of Google Refine appeared first on Gaia Resources.

]]>
For a number of our projects we have to either clean data or transform data from a number of different sources / formats into subtly different formats. While it is possible to write custom (Python, Perl, Java, AWK) scripts to do this, sometimes it's just a little too tedious getting to the bits we're really interested in. Luckily there's a great tool that makes it easy to pull in messy and / or disparate data sources and work with the data in a spreadsheet-style interface, and it fits the bill in a number of ways: namely Google Refine, which calls itself a 'power tool for working with messy data'.

  • It’s open source, so it fits in with our corporate policy of using FOSS wherever possible
  • It’s cross-platform
  • While it is browser based, it keeps all the data locally, so we can ensure that we don’t unwittingly send our clients’ data somewhere they don’t want it
  • It understands a broad range of file formats and web protocols, but doesn’t limit you to those
  • It has unlimited undo history
  • It has data templating so you can customise your export formats
  • It clusters data intelligently, making normalisation / deduplication a doddle
  • It’s extensible – your transformations can be written in GREL (its own expression language), Jython or Clojure
  • It can export in numerous formats and links well with Google Docs and Fusion Tables, as well as being compatible with Microsoft Excel and OpenOffice/LibreOffice.

While it looks deceptively like a spreadsheet, it is anything but. Spreadsheets are great for working with data, doing repeated calculations and making pretty graphs. Refine is great at working with lots of data, doing lots of slightly different transformations, and understanding the breadth of your data; it is not made for making graphs, but you can even do that if you really want to.

I recently used Google Refine to help populate some species (or taxon) pages in the BDRS.  To do this, I used the taxon profile importer in the BDRS, which uses a list of LSIDs to query the ALA, and populate species pages from that content.  This was a function created to help the ALA staff (and us!) help other groups get taxon pages up and running quickly, building on the significant investment that the ALA has made already.

To do that, I knew I could get a list of LSIDs from the ALA’s name services, running at biodiversity.org.au. I searched for the class Insecta, and used Google Refine to parse the JSON response and export the data in a format that I could then copy and paste directly into the BDRS to populate the taxon pages.

The screencast is about 4 minutes long, and I’ve added annotations; this is a pretty accurate reflection of the time it would take to do this process, and go from a BDRS with no species pages to one with a whole bunch of them in around 5 minutes. While this doesn’t flex Google Refine’s muscles very much, it does make use of its ability to pull data from a URL, parse JSON and export the resulting data in a specified format using templating.
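The fetch-parse-template workflow can also be sketched outside Refine. The following is a rough Python equivalent; note that the search URL and the JSON field names (`results`, `scientificName`, `lsid`) are assumptions for illustration, not the real biodiversity.org.au API shape.

```python
import json
from urllib.request import urlopen

# Illustrative only: the real biodiversity.org.au name-service endpoint
# and its response fields may differ from what is assumed here.
SEARCH_URL = "https://biodiversity.org.au/name/search?q={query}&format=json"

def extract_lsids(payload):
    """Pull (scientific name, LSID) pairs out of a search response dict."""
    return [(r["scientificName"], r["lsid"]) for r in payload.get("results", [])]

def to_import_rows(pairs):
    """Format the pairs as tab-separated lines, mirroring Refine's
    templated export, ready to paste into the BDRS taxon profile importer."""
    return "\n".join(f"{name}\t{lsid}" for name, lsid in pairs)

if __name__ == "__main__":
    with urlopen(SEARCH_URL.format(query="Insecta")) as response:
        payload = json.load(response)
    print(to_import_rows(extract_lsids(payload)))
```

The point of Refine, of course, is that you get the same result without writing or maintaining any of this.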

 

 

If you have any responses, contact me via email or Twitter, or leave a message below.

The post Importing species profiles into the BDRS with the help of Google Refine appeared first on Gaia Resources.

]]>
Bushblitz in Ned’s Corner (VIC) with TRIN https://archive.gaiaresources.com.au/bushblitz-in-neds-corner-vic-with-trin/ Tue, 06 Dec 2011 05:19:49 +0000 http://archive.gaiaresources.com.au/wordpress/?p=1808 The dev team goes bush Bushblitz is a biodiversity discovery partnership between the Australian Government (ABRS), BHP Billiton and Earthwatch Australia, which over the past 18 months has succeeded in discovering over 500 new species on a large number of total biological inventory surveys throughout Australia. Gaia Resources also hosts and maintains the Bushblitz web... Continue reading →

The post Bushblitz in Ned’s Corner (VIC) with TRIN appeared first on Gaia Resources.

]]>
The dev team goes bush

Bushblitz is a biodiversity discovery partnership between the Australian Government (ABRS), BHP Billiton and Earthwatch Australia, which over the past 18 months has succeeded in discovering over 500 new species on a large number of total biological inventory surveys throughout Australia. Gaia Resources also hosts and maintains the Bushblitz web site, and has had an active involvement in Bushblitz for some time.

The Bushblitz partnership had been invited onto Ned’s Corner Station to produce a biological inventory. The station is a former sheep range that has been added to the National Reserve Network through funding from the Federal Government and the Trust for Nature. So, following on from our past field trials, last week AJ, Timo and I went out to do some field work, testing the mobile data recording components of the BDRS as part of the Taxonomic Research and Information Network mobile data capture project. We all worked really hard and had a lot of fun working with the cross-disciplinary team from a large number of Australia’s biodiversity institutions.

We arrived just before midnight on Sunday, after a short flight to Adelaide and then a long drive up to the north-west corner of Victoria, to discover that the Bushblitz logistics team had already put up some tents for us. We soon got the Anabat set up, and the data logger on it just started going crazy, which gave us a clue that we were in the right place for a biodiversity survey.

Next morning we started early and chatted with various scientists about what we were going to be doing over the next few days. A lot of people were interested in the potential of something that could streamline workflows and avoid the double handling of data, and wanted to see it working for themselves. We then double-checked the custom forms and survey methodologies (AKA ‘Census Methods’ in the BDRS) with a number of scientists, and made some last-minute changes to the forms thanks to the flexible capabilities of the BDRS, before downloading the form definitions and workflows onto the menagerie of mobile devices that we’d brought with us. Finally, after lunch, Timo and I went out to start doing some real fieldwork. I scratched my botanist itch and went out with Val and Andre from the National Herbarium of Victoria and Dave from the Northern Territory Herbarium. Timo went out with the two Karens from the mammal team at Museum Victoria, and AJ stayed in camp to keep an eye on the server and on the new records coming in from the field devices.


Over the next couple of days we all spent hours in the field helping the enthusiastic scientists record data, set traps, empty traps, catch reptiles, survey birds and get bogged vehicles out of the mud. We learned the ins and outs of every kind of trap on the market, including Malaise and pitfall traps for invertebrates, and Elliott, pitfall and snap traps for mammals and reptiles. GPS coordinates and altitude were taken by hand and written down for each collecting locality. What became obvious to us all was that the field work day is a long and busy one, and the work was only half done by dinner time. Most of the scientists then spent the evening identifying, expanding their field notes with additional information, and preparing and processing any specimens they had collected so that they could be accessioned into their respective collections. None of them were looking forward to the task of capturing and cleaning all their data on their return.

Over the course of the three days in the field we managed to log records in a variety of surveys using a number of different methodologies. We also managed to identify several easy-to-fix issues with the mobile tool that will greatly improve the user experience, should they all be successfully implemented.

Survey                                 Record Type           No of Records   No of Taxa   No of Individuals
MV Mammal Trapline Data                Observation           4               1            n/a
UNSW SBEES Plant Insect Interactions   Locality              2               n/a          n/a
                                       Host Plants           2               2            n/a
                                       General Observation   11              10           n/a
Gaia Resources Bird Survey             Observation           134             46           329
National Herbarium of Victoria         Observation           10              10           n/a

A sample overview map of records can be seen in the screenshot below – this is the bird survey that we were carrying out as a team, as this is a group we were all reasonably familiar with, and we also have a bit of a competition with our director as to which species we’ve managed to twitch.

Fieldwork is Fun

A lot of our clients know that fieldwork can be really good fun and really challenging, but it was good to get some of our developers exposed to the conditions that our clients experience. We got bogged in the mud, caught spiders, spotted birds and even went on an after-dark reptile spotlighting expedition.

Here’s a slideshow of images taken either by Museum Victoria staff (most notably Mark Norman, who is an excellent and prolific photographer) or by the Gaia Resources team with their various phones, tablets and cameras.

Mobile Mapping

Val Stajsic from the National Herbarium of Victoria asked AJ if he could convert UTM coordinates into something he could use to find collecting localities on a map. AJ decided to go one better and dumped the coordinates out into a KML file that Val can now use in Google Earth should the need arise. We then also discovered that you can point the Android version of Google Maps directly at a KML file on the web and it will map the points for you. I quickly uploaded the KML file into my Dropbox account and hey presto – a collecting map:
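A minimal sketch of that kind of KML dump is below. It assumes the UTM coordinates have already been converted to WGS84 latitude/longitude (in practice you would use a projection library such as pyproj for that step), and the locality name and point are made up for illustration.

```python
# Assumes input points are already WGS84 lat/lon (UTM conversion done
# separately, e.g. with pyproj); the example locality below is invented.
from xml.sax.saxutils import escape

def localities_to_kml(points):
    """Render (name, lat, lon) tuples as a simple KML document that
    Google Earth or the Android Google Maps app can display.
    Note KML coordinates are written in lon,lat,alt order."""
    placemarks = "\n".join(
        f"  <Placemark><name>{escape(name)}</name>"
        f"<Point><coordinates>{lon},{lat},0</coordinates></Point></Placemark>"
        for name, lat, lon in points
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n'
        f"{placemarks}\n"
        "</Document></kml>"
    )

print(localities_to_kml([("Ned's Corner HQ", -34.107, 141.285)]))
```

Dropping the resulting file somewhere web-accessible (as with the Dropbox trick above) is all it takes to get the points onto a phone.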

Screenshot of the existing collecting localities in Google Maps

As a result, the three of us Gaia staff found ourselves guiding the botanist team in to a collecting locality via radio. Google’s imagery told us there was a track where we could not see anything directly, as weeds had covered it up.

Screenshot of navigating to a collecting locality in Google Maps

What about Battery Life? Rain? Dust?

I thought I’d put in a few observations on these things, and some thoughts for the future. Firstly, AJ’s car kit went really well with the Samsung GALAXY Tabs that Piers managed to pick up several months ago. We’ve also been looking at solar chargers, backpacks like the Voltaic Spark Tablet Case, or even solar clothing. And in terms of general protection for the devices, as we are talking consumer-grade technology there is a whole ecosystem of suppliers that have answered the call of ‘I want something that will keep my device dry / dust free / unbroken’, including the Otterbox Defender series, which is even available for our tablet devices.

Takehome messages

We all really enjoyed the trip – it is not often that we get to mix with so many leading scientists from such a great range of institutions. Timo summed it up well with his experience – he had a photo of a spider caught in one of the mammal pitfall traps, and he took it to Dr Barbara Baehr, who suggested a website created by one of Australia’s leading wolf spider (Lycosidae) specialists, Dr Volker Framenau, which lists this exact species as a member of a new genus. Bushblitz is great as it brings people from a whole suite of disciplines together to work on a great project.

The pleasing outcome for us was that the many scientists we worked with over the course of the trip could see that by using technology such as our tools, they could free themselves up from the data entry burden when they get back to the office.  I think that having us there as a team to support them in the field worked really well, and we had some great discussions about how they could use this sort of technology in the field – ranging from collecting data at the trapping site, knowing that you’ve hit a permit limit for a species, knowing where to go, or tracking where the teams are currently at.  There are many benefits to using technology in the field – our job is to minimise and remove any (real or perceived) risks.

In terms of our mobile software we found that the software setup coped well with custom forms (AKA field data sheets) and census methods (AKA defined methodologies). We found a few easy performance and user interface enhancements which will be going into the mobile tool ASAP. In addition to our software the great variety of other apps available in the various market places really makes commodity smart phones / tablet computers very useful tools to have in the field.

Our aim on this project was to get a prototype field data collection system up and running.  I think we well and truly achieved that, and a whole heap of additional outcomes that came about by just being on the trip itself.  Watch this space for more fieldwork / biogeekery…

Kehan

The post Bushblitz in Ned’s Corner (VIC) with TRIN appeared first on Gaia Resources.

]]>
