data science –

A platform for monitoring forest canopy cover

Chris Roach — Wed, 18 Aug 2021 05:10:30 +0000

We recently completed a technical investigation into leveraging Sentinel satellite imagery for monitoring canopy cover in forested areas of Victoria. The investigation was a really interesting deep dive for our data scientists into some free and open-source analytical tools and techniques our client could use to assess one aspect of bushfire risk. The client wanted a repeatable operational tool they could use to home in on areas of higher risk, and make informed decisions about where to focus limited field resources.

Forest ecosystems play an essential role in the environment. Monitoring and detecting change in forests is important for the development of conservation policies that would lead to sustainable forest management. For this purpose, Earth Observation (EO) data can be analysed in order to assess disturbances in forest vegetation, as it can reach a worldwide coverage with a high temporal frequency at a low cost. Currently, remote sensing techniques are being used to process EO data from passive and active sensors, providing fast and accurate results across a wide range of applications.

The idea for our particular study was to look at Sentinel-2 (optical) and Sentinel-1 (Synthetic Aperture Radar, or SAR) imagery as inputs into a processing model that outputs a regional NDVI (Normalised Difference Vegetation Index) raster coverage. The Sentinel-2 satellites capture an image every 5 days for a given patch of the world, so the potential was there for a long-term monitoring and change detection tool. We needed to assess the imagery products to see if they could give us useful and consistent output, and ground-truth that against known areas of change on the ground. We also needed to know what technology was out there to help us in this challenge, and what others (researchers and private organisations) had done to solve similar challenges.
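For readers who have not met NDVI before, here is a minimal sketch of the calculation. It is not the code from our DEA notebooks or GEE scripts. It assumes the usual definition, NDVI = (NIR - Red) / (NIR + Red), which for Sentinel-2 is normally computed from band 8 (near infrared) and band 4 (red). The function names and the tiny sample grids are made up for illustration, and a real workflow would use an array library rather than plain Python lists.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None when both reflectances are zero."""
    total = nir + red
    if total == 0:
        return None  # no signal in either band, so the index is undefined
    return (nir - red) / total


def ndvi_raster(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Made-up reflectance values for a 2 x 2 patch (not real Sentinel-2 data).
nir_band = [[0.45, 0.40], [0.30, 0.0]]
red_band = [[0.05, 0.10], [0.30, 0.0]]

for row in ndvi_raster(nir_band, red_band):
    print(row)
```

Values close to 1 suggest dense, healthy vegetation, while values near zero or below suggest bare ground, water or stressed canopy.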

The Sentinel-2 platform (left) from the European Space Agency provides coverage every 5 days that can be used for forest canopy cover applications.

We started with a literature review to uncover research that had been conducted on the use of Sentinel and Landsat imagery for forest cover change, in the south-eastern parts of Australia, nationally and overseas. This step also included looking at current ground-based forest canopy assessment techniques. With the short-term nature of our investigation, we had to really target and timebox our search, and we were able to find some really good material from research at universities across Australia and major research organisations. The Veg Machine project, and the previous work done by its originators Jeremy Wallace and Bob Karfs at CSIRO on long-term monitoring using Landsat coverage, was an inspiration for our modelling approach, as were the personal experiences of our team members from previous projects and roles.

The literature review also had a more software-focused aspect to it, as we were looking at a number of analytics platforms that could be the processing backbone and visualisation tool for our modelling. From this we decided to pick up some Jupyter Notebook scripts in Geoscience Australia’s Digital Earth Australia (DEA) platform, and leverage the Google Earth Engine (GEE) platform. The DEA product enabled generating outputs for a regional scale view, and the GEE platform enabled users to produce NDVI plots on demand for a given local area. The two platforms complemented each other by providing that regional overview alongside time series plots for target areas.

The Digital Earth Australia (DEA) platform and Jupyter Notebook scripts configured for the regional comparison of NDVI images against a long-term baseline.

The Google Earth Engine platform enables users to look at time series plots of NDVI for an area of interest.

We devised a modelling approach that would ingest new Sentinel imagery inputs and compare them against a 3 year rolling NDVI baseline. If the new image contained pixels above or below our thresholds, then they would simply show up as a different colour on the mapping: green for significant positive change, red for significant negative change. In this proof-of-concept investigation, the client was happy to simply detect a change of significance; the reason for that change was something they could target and follow up on. That reason could be anything from heat stress and planned or unplanned land clearing to fire activity or disease. We also considered seasonal differences and the frequency of images for processing within that modelling approach.
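The comparison described above can be sketched in a few lines of Python. This is an illustration only, not our production model: the function names, the per-pixel mean as the baseline statistic, the 0.2 NDVI threshold and the sample grids are all assumptions made for the example, and seasonal adjustment is left out.

```python
from datetime import date, timedelta
from statistics import mean


def rolling_baseline(dated_images, as_of, years=3):
    """Keep only the NDVI grids acquired in the `years` before `as_of`."""
    start = as_of - timedelta(days=365 * years)
    return [grid for acquired, grid in dated_images if start <= acquired < as_of]


def classify_change(baseline_stack, new_image, threshold=0.2):
    """Label each pixel 'gain', 'loss', 'stable' or 'nodata' against the baseline mean.

    threshold is a hypothetical NDVI difference treated as significant.
    """
    labels = []
    for i, row in enumerate(new_image):
        label_row = []
        for j, value in enumerate(row):
            history = [g[i][j] for g in baseline_stack if g[i][j] is not None]
            if value is None or not history:
                label_row.append("nodata")
                continue
            diff = value - mean(history)
            if diff >= threshold:
                label_row.append("gain")    # would be drawn green
            elif diff <= -threshold:
                label_row.append("loss")    # would be drawn red
            else:
                label_row.append("stable")
        labels.append(label_row)
    return labels


# Made-up NDVI grids; the 2017 image falls outside the 3 year window.
dated_images = [
    (date(2017, 3, 1), [[0.10, 0.10], [0.10, None]]),
    (date(2019, 2, 1), [[0.70, 0.30], [0.60, None]]),
    (date(2020, 2, 1), [[0.74, 0.34], [0.64, None]]),
    (date(2021, 2, 1), [[0.72, 0.32], [0.62, None]]),
]
as_of = date(2021, 8, 1)
new_image = [[0.40, 0.60], [0.65, 0.50]]

baseline = rolling_baseline(dated_images, as_of)
print(classify_change(baseline, new_image))
```

A fixed threshold is the simplest possible choice. A threshold scaled by each pixel's historical variability would be one way to account for areas that naturally fluctuate more between seasons.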

Finally, operational staff at the client’s offices use ArcGIS and QGIS software for a range of mapping and analysis tasks. The final raster outputs from the DEA and GEE platforms are capable of being visualised and analysed further in GIS packages along with other key operational and administrative layers.

GIS software can visualise the raster outputs from the modelling for further analysis and decision support.

So as a first step proof-of-concept investigation, we were able to document a technical and operational approach for our client to detect forest canopy cover change and support their bushfire risk decisions. The next stage will be all about implementing and scaling a solution on AWS cloud infrastructure.

We’d love to hear from you if you have been involved in using any of the tools or applications mentioned here, or you’d just like to have a chat out of interest. Feel free to contact me or hit us up on Twitter, LinkedIn or Facebook.

Chris

The post A platform for monitoring forest canopy cover appeared first on Gaia Resources.

Future Led: Superintelligent AI

Sophie Darnell — Wed, 04 Aug 2021 05:21:39 +0000

I recently attended Future Led: Superintelligent AI – Social Saviour or world threat? The event was the second I have attended in the Future Led series of events being run at Liquid Interactive, a digital experience agency that we partner with. Andrew Duval, Executive Creative Director at Liquid Interactive, moderated and, despite the depth and breadth of topics to cover, helped to keep the conversation informative and engaging within an admittedly tight timeframe of an hour.

The panel brought a wealth of both theoretical and applied knowledge, with experience ranging from the practical application and development of AI to its ethical implications. The panellists were Nick Therkelsen-Terry, CEO at Max Kelsen, an engineering agency with a focus on AI and ML; Sue Keay, CEO of the Queensland AI Hub and chair of the Board of Robotics Australia Group; Dr Evan Shellshear, Head of Analytics at Biarri, which develops mathematical modelling and predictive modelling solutions; and Justine Lacey, Director of the Responsible Innovation Future Science Platform at CSIRO.

I have to say, I may have originally felt a little intimidated by the experience of the panel, but the speakers quickly absorbed me in their discussions on how the development of superintelligent AI could impact all of our lives; the potential for both positive and negative impacts, as well as what we as a society and as developers of technology should be considering as we move forward.

One of the aspects of the discussion I found particularly interesting was when the panellists raised the question “What is creativity?”. If we program AI that can build upon existing data, even improve upon it, at what point are the outputs of the AI deemed “creative”? Specific definitions of creativity exist and were discussed, and the definitions themselves were not contentious; what remains unresolved is how we might apply them to work produced by an artificially intelligent system. Dr Evan Shellshear asked how we might differentiate between the artists of the past whose work was not appreciated by the critics of the time and an Artificial Intelligence that was creative in a way we were not yet ready to understand.

The panellists’ conversation around creativity stemmed from questioning how we define intelligence. Existing AIs operate exceptionally well in the specialised areas in which they have been trained. The open question is how to take machines to a level of intelligence that we have previously only considered humans capable of (and beyond): is it possible to develop machines that not only build on previous data, but can also make intuitive connections and seek out new paths? A phrase often used when talking about software and technology is ‘garbage in, garbage out’. It is based on the idea that a computer can only operate based on the code, or instructions, we give it. Therefore, if we develop intelligent machines that operate based on the information that we provide, those machines can only make decisions based on what that data tells them. There have been some abysmal examples of Artificial Intelligence solutions that not only perpetuate the societal and cultural biases of the humans who designed them, but in some cases also worsen their impact. The panellists discussed that AI is not only a tool which we have to develop while mindful of our own inherent biases, but also one with the potential to help us identify and understand those biases. This could then be part of a conscious step towards a society that identifies, acknowledges and addresses when our subconscious bias is contributing to unfair outcomes for minorities.

Given that we imagine using a future superintelligent AI to solve the problems that as yet remain unsolved, we can’t be certain that the solutions to those extraordinarily complex problems will emerge in quite the form we imagine. Dr Justine Lacey raised the question of whether artificial intelligence might take such a different approach to problem-solving that its solutions are not the ones we expect. While the panellists were primarily optimistic about how we might use an artificial superintelligent machine to improve our society, it seems that we need to look at both the practicalities and the technology required, while also considering the theoretical implications of what it might mean for us as humans. Nick Therkelsen-Terry in particular spoke strongly about the importance of investment and research in this area for Australia. Given the amazing potential and opportunity to be had in the area, let alone as an industry, I think this is something that most of us could agree on.

While there is still a way to go before the advancements of AI and ML are truly ‘superintelligent’, there are already so many problems that Artificial Intelligence and Machine Learning are helping us solve. Just a couple of months ago, our Data Science Unit Lead, Chris Roach, shared a blog about the exciting results we had with the prototype fish-identification product that we developed in partnership with the Global Wetlands team at Griffith University. This project was part of the “Counting Fish” challenge that was put forward by the Australian Institute of Marine Science (AIMS) to address the problem of highly manual fish identification work that currently requires significant person-hours and resources for the collection of marine data. The prototype showed us we can successfully reduce the requirement on human resources to collect this important data, allowing us to contribute to research and decision making in the marine sciences much more effectively and efficiently.

If you have a problem that needs solving, and would like to discuss how Gaia Resources could help you solve it, please feel free to get in touch with me or our Data Science Unit Lead Chris Roach. Alternatively, hit us up on Twitter, LinkedIn or Facebook.

Sophie


Innovations in Biodiversity Data Management

Sophie Darnell — Wed, 23 Jun 2021 01:33:46 +0000

The Atlas of Living Australia (ALA) recently held a webinar series, hosted by the CSIRO, that in June included a session on Innovations in Biodiversity Data Management. Mieke and I were able to attend and wanted to share with you how it went. It was interesting to get an overview of some of the many biodiversity information projects going on, particularly since this is an area of interest for both Mieke and me, and one that we work on within Gaia Resources – and many of the initiatives are very close to our corporate mission, too.

The first speakers were from the National Herbarium of NSW, discussing their current digitisation project. Triggered by receiving funding for the construction of a new facility, the Herbarium sought additional funding, initially through crowd-sourcing, to allow them to launch the digitisation project in parallel. Due to the size and scale of their project, we heard from three speakers: Jo White, Hannah McPherson and Kevin Noakes. At the time of the webinar, they were approximately halfway through a collection of 1.4 million items! At those volumes, the project needed to process 500 images every hour. While they have had some delays due to COVID-19, it was great to hear about the progress that has still been made in making their collection available as a digital asset for Australia and worldwide.

The digitisation project is providing image data from the collection that opens up opportunities in new areas. One of those mentioned was the use of Artificial Intelligence (AI) for the characterisation of different species, as well as the future potential for it to be used in identifying changes in leaves due to latitude and longitude! This is something we will also have to review for a few of the projects we’re undertaking with AI (like the work we’ve previously talked about with Fishscale) – more on those projects in a later blog. In what may seem a more prosaic outcome for most of us, but which will have massive benefits for collection managers, they are also already finding that loans of physical items are in some cases no longer required due to the quality of the digital scans. This helps to keep the physical items in better condition for longer, which extends the longevity of the collections for the future.

After hearing about the herbarium’s digitisation project, Larissa Braz Sousa spoke to us about the Great Southern Bioblitz, an international event held during Spring in the Southern Hemisphere. It aims to use citizen scientist surveys to capture the biodiversity of different regions, and the most recent Bioblitz captured over 90,000 observations! Some of the challenges she discussed were getting communities engaged in the event; ensuring correct identification of species (a shortage of plant and fungi specialists was specifically mentioned); ensuring appropriate access to and usage of the data; and trying to use the data gathered in decision-making conversations around biodiversity. It was great to see that so many people around the world engaged with the event and contributed to the data collected – while we are not undertaking as much citizen science involvement these days as we once did at Gaia Resources, we are still keen to hear what’s happening and see how we can continue to support these types of important community initiatives.

The final speaker was Ron Avery, who manages the Biodiversity Information System in NSW, and who we’ve worked with on a number of projects to date, like our BioSys project. Ron spoke about the importance of ensuring that biodiversity information could be made readily available for decision makers; highlighting that while there are biological data aggregators around the country, there is need for further cooperation and investment to coordinate a national approach to biodiversity information. We share Ron’s views at Gaia Resources and are attempting to facilitate that sort of coordination across a range of projects we’re involved in.

All of the speakers spoke about the importance of engaging the community, as well as ensuring that biodiversity information is available and understandable to the widest possible range of people within it. Of the challenges that still exist in ensuring the greatest value is returned from biodiversity information collection efforts, I think the two that are most critical for us to overcome can be encapsulated as:

Providing platforms for the collection and sharing of biodiversity data, and
Ensuring that this data is made available to the community and decision makers in a meaningful way.

These challenges aren’t new to those of us already working in the domain – we’ve worked very hard on these previously – but there has been renewed interest and investment in the domain as we try to find new ways to address the United Nations’ Sustainable Development Goals and provide better data for Australia’s own State of the Environment reporting. Hopefully, by providing that connection between the research and resources collected and bringing impact-driven feedback into the broader community and to decision makers, we can get better outcomes for the environment and Australia as a whole. I am excited to be working in this space at a time when there is so much opportunity for us to bring more effective data management that can help decision makers and the community.

There’s a lot more in terms of biodiversity innovation out there – and in here at Gaia Resources – so we’ll likely make this an ongoing theme in our blogs over the next few months to outline a few more of the ways in which we think innovation is occurring, and how we can push that envelope a bit further!

If you have any challenges or collaborations you can see on the horizon for biodiversity data within your realm, or even if you have any questions, feel free to email me or strike up a conversation on Twitter, LinkedIn or Facebook.

Sophie


Artificial Intelligence for fish species identification

Chris Roach — Wed, 16 Jun 2021 01:30:49 +0000

As we wrote in our previous blog, the “Counting Fish” challenge was put forward by the Australian Institute of Marine Science (AIMS) as part of a call-out to look at innovative and streamlining technologies for a widely used method of marine research data collection. The Commonwealth Government’s Business Research and Innovation Initiative (BRII) has provided the grant funding and program to bring the best minds and solutions to tackle the challenge. Together with our partners at the Global Wetlands team from Griffith University, we’ve recently finished up the first stage which was an intensive 4 month Feasibility Study.

The study focused on BRUVS (Baited Remote Underwater Video System) footage, and leveraging artificial intelligence (AI) and machine learning (ML) technologies to collect data and accelerate our understanding of fish in our oceans. AIMS and other researchers spend a lot of time manually capturing data from the videos, so finding efficiency measures and improvements to data consistency and quality would be of tremendous value. Out of the study we built a prototype application for processing and visualising BRUVS data, including automatically identifying and counting tropical fish species.

Taking the OzFish open dataset and many hours of AIMS BRUVS footage, the team focused on training the AI model to accurately identify a range of fish species representing rare and common fish, fast moving and very small species, and schools of overlapping fish, and to differentiate morphologically similar species. Demonstrating the effectiveness of our method on these specific challenges allowed us to produce quantified, highly accurate results. We are now able to look confidently ahead towards tackling hundreds of species that live in Australia’s tropical waters.

The Fishscale online prototype – video metadata and playback showing annotations and count statistics.

When we look back at it, we’ve achieved an incredible amount in a short space of time. Our nationally distributed team (Perth, Brisbane, Darwin) worked really hard to make sure we were on the same page and productive with online meetings, collaborations and workshops. This was no small feat when you think we had two COVID-19 lockdowns affecting our Queensland team members.

With a new Fishscale prototype web interface, a new BRUVS video can be uploaded and processed within minutes. While the researcher grabs a coffee, it generates the statistics they need to help model and understand population ecology and fish behaviour. There’s an important human quality control element as well, meaning that fish experts have the ability to make corrections, improve the model and increase the value of their data.

We really enjoyed the regular interaction with the AIMS team as well, which helped us to design our Fishscale prototype with exciting features that will eventually deliver lots of value and efficiency gains for research workflows and other industry applications.

So what happens next? Well, there is still plenty to do if we progress to the next phase. We know there are still challenges around much larger numbers of species, variations in water quality and environmental factors. In phase 2, our plan includes customising the user interface to adapt to different user types depending on their requirements for data capture and output. Different products based on the AI framework will have different audiences in mind depending on whether they come from research, monitoring, education, or not for profit groups.

We are confident this is just the beginning of an exciting journey to develop a highly valuable product for streamlining research workflows and generation of important statistics. In fact, the Proof of Concept phase starts up around September, and we are hopeful we can progress and continue working with AIMS on this key initiative.

If you are interested in this space or are someone who works with underwater videos and fish identification, we would love to get your perspective for future development. Feel free to give me a call or send an email if this type of work interests you, or strike up a conversation on Twitter, LinkedIn or Facebook.

Chris


Biodiversity and high-performance computing

Piers Higgs — Wed, 19 May 2021 01:33:28 +0000

On Friday, 14 May, I went to an event at the University of Western Australia, titled “Global biodiversity hotspot with cutting-edge compute!”, hosted by The DNA Zoo.

The title intrigued me – as this is basically what Gaia Resources was set up to do from the start, way back when I founded it in 2004. The aim was to combine my interests of technology and the environment to see how we can help make the world a better place, which has become our formal mission statement in the last couple of years:

“Gaia Resources is a consultancy that responsibly delivers sustainable technology solutions to make the world a better place.”

The event was primarily aimed at an academic or research audience, although the speakers came from a few different places and backgrounds, including research organisations, industry and government organisations. The event itself was a free-flowing panel discussion, which took me on a pretty interesting journey for a Friday afternoon. Along the way, there were discussions around some pretty interesting topics, which resonated with a range of our existing work in the data science and high performance compute space.

One of the threads that stuck with me was around the lack, or lag, of skillset development in the biodiversity and high performance computing space. There were a few other threads around the economics of the environment, and societal changes, but the skillset one really stuck in my head across the weekend afterwards, so I’ve focused this blog on that aspect (and there are heaps of different theories and studies around this lag out there in business management circles).

At Gaia Resources, apart from working on a wide range of interesting projects, we try to develop our skills across multiple areas, including subject matter areas (e.g. archives, biodiversity, etc), and also in the use of technology (our Amazon Web Services partnership is an example of that). However, we also need people to interface between them, which is where our Data Science unit comes in – on our team there are a mix of subject matter experts, technologists, and analysts that sit in between and help facilitate development of highly valuable solutions for our clients.

At the event, I noted multiple times that there are a range of projects or programs we are involved with around Australia to try to address that lag, such as the Business Research and Innovation Initiative grants, which we recently were successful in receiving. Providing opportunities for researchers and industry to work together often takes that external stimulus, and that’s an important thing for governments and other groups to consider if we want to reduce that lag across the board.

Another good example of this lag, and how we can help address it, is via our support for the Taxonomy Australia Decadal Plan for Taxonomy and Biosystematics in Australia and New Zealand (a link is provided below). This plan is all about accelerating species discovery in the region – so that we can discover and document this biodiversity in a generation, rather than several centuries. This is aligned so clearly to our own mission at Gaia Resources that we’ve stepped in to volunteer our services to assist Taxonomy Australia, and we’re providing support and developing some technology prototypes for them that might help with this acceleration (more on that to come in a future blog).

Taxonomy Australia’s Decadal Plan

This is also why we work with Archives around Australia. One thing that was discussed at the event was the challenge of not being able to store biodiversity research data in a sustainable way – something that the Archives have managed, and that we’ve been working collaboratively with them on for some time. Looking outwards at the Archives industry, especially around Digital Preservation activities such as the pilot we’ve just completed for Queensland State Archives (again, more on that to come in another blog), will certainly help to bring that expertise back into the biodiversity community. Those sorts of linkages are often overlooked!

These examples are all part of how industry can help reduce that lag between innovation or research and implementation, how a cross-industry approach delivers benefits, and how industry can apply our technology skills to organisations that need that support to help them innovate and accelerate. In the meantime, we’ll keep on working towards trying to cross-fertilise and cross-pollinate projects and to bring our expertise to where they can deliver a real value to the people we work with – and keep on trying to make the world a better place, one project at a time.

If you’d like to know more about what we can do for your organisation, in terms of data science, technology or biodiversity information, then drop me a line, call our offices on (08) 92277309 (Perth) or 0438 718 164 (Brisbane) or strike up a conversation on our Twitter, LinkedIn or Facebook pages.

Piers


Checking in with our (spatial) Data Scientists

Chris Roach — Wed, 12 May 2021 01:30:50 +0000

Our Data Science team meet regularly to talk about our workloads, upcoming projects and emerging technologies. With software engineers, business analysts, spatial analysts and technical project managers on board, we have quite a variety of skills and experience; but a shared passion for leveraging all sorts of data sources, tools, software and infrastructure. I think we all love coffee too!

The part of our team that normally works with GIS (Geographic Information Systems) software is noticing a shift from static to dynamic deliverables, and they are adapting their skills and knowledge to meet a changing demand in consultancy. The solutions they are contributing to now are data-driven decision support tools, monitoring dashboards, change detection toolsets, apps and websites. Desktop analysis is still a key feature of the workflow, but they work more closely these days with our software engineers and research partners to go deeper into the tech, designing models and technical processes that can be run on cloud infrastructure, such as our solutions using Amazon Web Services. More and more we are also leveraging real-time data sources to assist our clients, like daily satellite feeds, web APIs, sensor arrays or business systems data.

I thought it could be timely to check in with our (Spatial) Data Scientists, and ask them to look back over the last 11 months or so to recap on some of the interesting projects we’ve been involved in. This is what they had to say:

Jake Geddes – Senior Spatial Analyst

1. What are you working on at the moment, and what makes your spatial work special?

I’m currently working alongside a resources company’s Heritage Team to consolidate their heritage data into a master database as well as offering GIS support. I think the majority of spatial work we do at Gaia Resources is quite special, because you often get the feeling that you really are making a positive difference to the environment and people’s lives.

In another project this year, I advised our mobile app developers who were working on algorithms to visualise bushfires in real time. I also work with clients to establish spatial data standards around biodiversity and other environmental data to support their regulatory reporting requirements.

Real-time bushfire hotspots and fire scars are generated from satellite imagery and rendered on a mobile app from web services such as WMTS and GeoJSON feeds.
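As a rough idea of what consuming a GeoJSON hotspot feed can look like, here is a small sketch using only the Python standard library. It is not the code from the mobile app: the inline sample feed, the `confidence` property and the function name are hypothetical, and a real client would fetch the feed from a web service rather than a string.

```python
import json

# A tiny inline stand-in for a hotspot feed (hypothetical property names).
feed_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [146.32, -37.45]},
     "properties": {"confidence": 85}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [145.10, -36.90]},
     "properties": {"confidence": 40}}
  ]
}
"""


def hotspot_points(geojson_text, min_confidence=50):
    """Return (lat, lon) tuples for point features at or above min_confidence."""
    collection = json.loads(geojson_text)
    points = []
    for feature in collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch ignores fire scar polygons
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        properties = feature.get("properties") or {}
        if properties.get("confidence", 0) >= min_confidence:
            points.append((lat, lon))
    return points


print(hotspot_points(feed_text))
```

Note the coordinate order: GeoJSON stores longitude first, while many mapping libraries expect latitude first, which is a common source of markers landing in the wrong place.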

2. We are getting close to the end of the financial year – what was the highlight for you in the last 12 months?

There have been a few highlights for me despite a somewhat difficult and chaotic year for “Earth”:

Gaining a deeper understanding of how spatial and software engineering interact, such as in the Retromaps project I was involved in.
Supporting multiple environmental organisations in collaboration and conservation (e.g. WABSI, Greening Australia, multiple NRM groups)
I was able to lead a GIS health check investigation (find out more about these here)

3. What do you see as the link between Data Science and GIS?

Consolidating Big Data sources into something usable and effective can be quite challenging, but there are now numerous tools and processes in GIS which move beyond the consumption of raw data products and create links with data science workflows. Being able to incorporate a geographic and visual aspect is a great advantage of GIS in Data Science applications.

4. Any new widgets or tools you’ve discovered, that you’d like to tell the world about?

I have been having a play with a few statistical packages and looking into how they can integrate with our spatial analyses. These include:

R software for statistical modelling
GeoPandas (and Fiona): an open source project to make working with geospatial data in Python easier.
kepler.gl: an open source geospatial analysis tool for large-scale data sets.
enso.org: a graphical user interface for automating data-driven processes

For years now I have been getting stuck into multiple plugins in the free and open source QGIS software, such as the Semi-Automatic Classification Plugin (SCP) and Data Plotly.

Barbara Zakrzewska – Spatial Analyst

1. What are you working on at the moment, and what makes your spatial work special?

At the moment I am working for a mining company and helping them with environmental approvals, disturbance and rehabilitation. It’s a huge company that used contractors before the current GIS Coordinator joined and started setting up a new holistic system. One of my favorite aspects of this type of work is building spatial systems in a manual sense and then creating models that automate repeatable workflows. I like building single-source-of-truth geodatabases, where incoming data is either handled automatically (e.g. via government download, seamless data feeds, custom scripts) or signed off by a knowledgeable data owner.

In other recent work, I supported our mobile app developers on a Transport Innovation project, to standardise and translate council parking data for inner Sydney suburbs.

2. We are getting close to the end of the financial year – what was the highlight for you in the last 12 months?

My highlight has been my current role, where I have access to proprietary software such as FME and exploration data in enterprise databases like acQuire. I have created several useful models that combine business and spatial data, and was asked for input on building and improving the enterprise GIS system.

3. What do you see as the link between Data Science and GIS?

I see GIS and other spatial modelling tools as vital components of many Data Science applications. Several GIS programs come with data science plugins, calculators and modelling tools that enable further analysis and data interoperability.

4. Any new widgets or tools you’ve discovered, that you’d like to tell the world about?

While I did not discover new amazing tools in the last year, I was able to use FME and QGIS – and my knowledge of the business requirements – to achieve some interesting and valuable outputs.

Rocio Peyronnet – Spatial Analyst

1. What are you working on at the moment, and what makes your spatial work special?

I am currently working on a forest canopy health change detection project for the Victorian government. We developed two models that account for variation in the canopy condition using Sentinel-2 imagery, in cloud-based platforms: Jupyter notebooks and Google Earth Engine. What makes this work special is that these models will help in the operational assessment of vegetation condition in an easy and quick way, allowing the user to focus on the areas that require attention more urgently.

Tree canopy health change – or difference in Normalised Difference Vegetation Index (NDVI) – is modelled with Jupyter Notebooks and imported into ArcGIS software.
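For readers who have not worked with NDVI before, here is a minimal sketch of how an NDVI difference between two dates can be calculated. It is not the project's actual model. It assumes the near-infrared and red reflectance bands (for Sentinel-2 these are conventionally bands B8 and B4) have already been loaded as NumPy arrays on the same grid, and the function names are purely illustrative.

```python
import numpy as np


def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red), with NaN where the sum is zero."""
    nir = np.asarray(nir, dtype="float64")
    red = np.asarray(red, dtype="float64")
    total = nir + red
    out = np.full(total.shape, np.nan)
    # Only divide where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def ndvi_change(nir_before, red_before, nir_after, red_after):
    """Return the per-pixel NDVI difference (after minus before)."""
    return ndvi(nir_after, red_after) - ndvi(nir_before, red_before)


# Illustrative reflectance values only, not real imagery.
change = ndvi_change(
    nir_before=[0.5, 0.4], red_before=[0.1, 0.4],
    nir_after=[0.3, 0.4], red_after=[0.1, 0.4],
)
print(change)
```

Negative values indicate a drop in NDVI between the two dates, which is the kind of signal a canopy change map would highlight for closer inspection.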

2. We are getting close to the end of the financial year – what was the highlight for you in the last 12 months?

Definitely, knowing that the work we are doing will support fire risk assessments and prevention of this kind of risk is the highlight of my year. I have only been part of Gaia Resources a short time, but I am really happy to be working for a consultancy and team focused on the delivery of sustainable technology solutions.

3. What do you see as the link between Data Science and GIS?

We are living in times in which data is constantly generated. Data scientists manipulate and analyse all that data to obtain information from it, but when the location parameter is present in the data, GIS can be put into action. By using GIS we can visualise spatial patterns in our data and present them in a map, providing a better understanding of where our information is positioned.

4. Any new widgets or tools you’ve discovered, that you’d like to tell the world about?

I am personally fascinated by Digital Earth Australia, a platform for open source analysis developed by the Australian government. It uses spatial data and satellite imagery to detect changes across Australia, providing code and tutorials so users can perform their own analysis, all free of cost.

————————————

What we’ve noticed is that most of our client engagements are no longer just standalone analyses, maps or even webmaps. Clients want consultants to provide value to the business – through streamlined workflows, integrated systems and insights – to help them make more effective and timely decisions. This invariably requires our spatial analysts, business analysts, software and devops engineers to come together to deliver the solution. As Data Scientists, we all agree that there are loads of tools and data sources on offer – but that the key to a successful project is to focus on the challenge or outcome, and to build out a plan that leverages the right tools, data and modelling approach for the task at hand.

If you have a perspective on the changing landscape of data science and the spatial industry, feel free to give me a call or an email. Or, strike up a conversation on our Twitter, LinkedIn or Facebook pages.

Chris

The post Checking in with our (spatial) Data Scientists appeared first on Gaia Resources.

The NAFI app is changing the way work is planned in the field

Chris Roach — Wed, 28 Apr 2021 01:40:21 +0000

Controlled burning is underway across the western and central parts of tropical north Australia. As we move into the dry season and the floodways on our Top End roads become accessible, Indigenous groups, parks managers and farmers are keen to get those early season burns in full swing. This type of fuel mitigation burning happens at a time of year when there is moisture in the soil and vegetation, in order to limit more catastrophic bushfires later in the season when everything has dried up. It reminds me of the explanation Dom Nicholls from the Mimal Rangers gave me over a coffee chat last year, when he said in East Arnhem Land they begin their programs as early as they can get the flames to take hold in the grassy vegetation – in March if they can get road access – and then race to fill the gaps later using fire scar mapping and careful planning.

Farmers like Mark Desaliy can use the app to monitor fires near their stations.

Our initial release of the North Australia and Rangelands Fire Information (NAFI) app for iOS and Android back in February brings the most used fire information resource for land managers in Australia to your phone, allowing you to keep a constant eye on bushfire threats. You can view maps of satellite generated fire activity (hotspots) and burnt areas (fire scars) provided by the NAFI service. There’s a good summary back in March from Rohan Fisher on ABC Radio – Kimberley.

At a regional scale like this area in northern NT and WA, the NAFI app represents real-time hotspots through a heat map clustering algorithm.
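The clustering algorithm used in the app is not described here, but the general idea behind a heat map can be sketched by counting hotspots in regular grid cells, so that busy cells can be drawn hotter. The function name, the 0.5 degree cell size and the coordinates below are all assumptions for illustration, not details of the NAFI app.

```python
import math
from collections import Counter


def bin_hotspots(points, cell_deg=0.5):
    """Count (lon, lat) hotspots per grid cell of size cell_deg degrees."""
    counts = Counter()
    for lon, lat in points:
        # floor() keeps cells consistent for negative (southern) latitudes.
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        counts[cell] += 1
    return counts


# Made-up hotspot coordinates, for illustration only.
hotspots = [(130.1, -14.2), (130.3, -14.4), (131.6, -15.1)]
for cell, count in bin_hotspots(hotspots).items():
    print(cell, count)
```

A real heat map would typically smooth these counts and change the cell size with zoom level, but the counting step is the core of it.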

Just to recap on how the app works behind the scenes to provide you with real-time fire information:

The hotspot locations are updated several times a day and the fire scars are updated up to once or twice a week depending on fire conditions.
The fire scars are produced by the NAFI Service and the hotspots are sourced from Landgate WA and Geoscience Australia.
Base maps for imagery and topography can be downloaded for offline use in your region of interest, and then used for when you go outside of mobile data range.
Burnt area mapping covers the Australian Savannas and rangelands that comprise around 70% of Australia, but does not cover NSW, VIC or the heavily populated regions of QLD, WA and SA.

So how popular is the NAFI app? We can monitor a number of analytics using iOS AppStoreConnect and Google Play console, or the Firebase dashboard. These are configurable dashboards that can tell us things like how many installations occurred by day or week and how many are actively used, filtered by operating system or device type. As of today, the iOS app has been downloaded 288 times since its initial release, and the Android version 142 times.

AppStoreConnect dashboard for the iOS NAFI app provides statistics of installations by week since the mid-February release.
Google Play Console shows the increase in installations of the Android NAFI app over time since the mid-February release.

We expect installations to continue upwards in the month of May and beyond, as more people on the ground become aware of the benefits and utility of the app. There are two phases of bushfire related activity where the app can be useful, associated with the early Dry season burn programs and carbon (emission reduction) projects, and the late Dry season bushfire response.

The statistics are anonymised so we are not tracking personal information, but the out-of-the-box analytics do help us to understand the trends, and – along with ratings and word of mouth – give us a bit more insight into how people are reacting to the app. This can then feed into our strategy with clients on helping them target marketing campaigns and prioritise enhancements. We also utilise Firebase Crashlytics as a way of logging the details of any crashes and error messages received, and this really helps us get quickly to the root cause of a technical issue a particular user is experiencing.

Please be aware if you are using the app:

Hotspot location on any map may only be accurate to within 1.5 km
The hotspot symbol on the maps does not indicate the size of the fire
Some fires may be small, brief, or obscured by smoke or cloud and go undetected
Satellites detect other heat sources such as smokestacks

For more information visit: https://savannafiremapping.com/nafi-mobile-app/

If you would like to know more about our projects with the NAFI team, strike up a conversation by sending me an email or getting in touch on Twitter, LinkedIn or Facebook.

Chris

The post The NAFI app is changing the way work is planned in the field appeared first on Gaia Resources.

The data science of plant trait data

Alex Chapman — Wed, 27 Jan 2021 01:01:58 +0000

Data Science is a large and growing multidisciplinary field that employs scientific method, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. It aims to unify data analysis, machine learning and related methods to understand the complexity of the world through large, often aggregated datasets. Together with Data Analytics – the discovery, interpretation, and communication of meaningful patterns in data – they are especially valuable in areas rich with recorded information. We’ve done a lot of work in both Data Science and Data Analytics at Gaia Resources over the years.

What prompted me to focus on Data Science and Analytics in this week’s blog is the imminent publication of a paper I contributed to – ‘AusTraits – a curated plant trait database for the Australian flora’ (Falster D et al., 2021 – in press). As the paper says:

“AusTraits synthesises data on 375 traits across 29230 taxa from field campaigns, published literature, taxonomic monographs, and individual taxa descriptions. Traits vary in scope from physiological measures of performance (e.g. photosynthetic gas exchange, water-use efficiency) to morphological parameters (e.g. leaf area, seed mass, plant height) which link to aspects of ecological variation. AusTraits contains curated and harmonised individual-, species- and genus-level observations coupled to, where available, contextual information on-site properties. This data descriptor provides information on version 2.1.0 of AusTraits which contains data for 937243 trait-by-taxa combinations. We envision AusTraits as an ongoing collaborative initiative for easily archiving and sharing trait data to increase our collective understanding of the Australian flora.”

Other colleagues from the Western Australian Herbarium and I were invited to contribute our data from the Descriptive Catalogue initiative, which provided a small number of observed traits for c. 12,000 WA plant taxa. To my mind, one key strategy for data science is that major datasets are developed and maintained in a manner that can contribute to even larger integrative projects such as AusTraits, for further data analysis, again as we outline in the paper:

“AusTraits version 2.1.0 was assembled from 351 distinct sources, including published papers, field campaigns, botanical collections, and taxonomic treatments. Initially, we identified a list of candidate traits of interest, then identified primary sources containing measurements for these traits. As the compilation grew, we expanded the list of traits considered to include any measurable quantity that had been quantified for a moderate number of taxa (n > 20). To harmonise each source into a common format, AusTraits applied a reproducible and transparent workflow – a custom workflow to clean and standardise taxonomic names using the latest and most comprehensive taxonomic resources for the Australian flora: the Australian Plant Census (APC) and the Australian Plant Names Index (APNI).”
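To give a feel for what cleaning and standardising taxonomic names can involve, here is a small sketch. It is not the AusTraits workflow itself: the function names are made up, the accepted-name list and synonym table are tiny hypothetical stand-ins for resources like the APC and APNI, and the taxon names are placeholders rather than real plants.

```python
def normalise(name):
    """Tidy whitespace and case: capitalised genus, lower-case epithets."""
    parts = name.split()
    if not parts:
        return ""
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def harmonise(names, accepted, synonyms):
    """Map raw names to accepted names; return (resolved, unmatched)."""
    resolved, unmatched = {}, []
    for raw in names:
        clean = normalise(raw)
        if clean in accepted:
            resolved[raw] = clean
        elif clean in synonyms:
            resolved[raw] = synonyms[clean]
        else:
            # Leave unknown names for a person to review, never guess.
            unmatched.append(raw)
    return resolved, unmatched


# Placeholder names only (hypothetical, not real taxa).
accepted = {"Genus alpha"}
synonyms = {"Oldgenus alpha": "Genus alpha"}
resolved, unmatched = harmonise(
    ["genus  ALPHA", "Oldgenus alpha", "Genus beta"], accepted, synonyms
)
print(resolved)
print(unmatched)
```

Keeping the unmatched names in a separate list, instead of dropping or guessing them, is one simple way to keep a workflow like this transparent and repeatable.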

The AusTraits project is hosted by the Australian Research Data Commons (ARDC) formed in July 2018. The ARDC is “a transformational initiative that aims to enable the Australian research community and provide industry access to nationally significant, leading-edge data-intensive eInfrastructure, platforms, skills and collections of high-quality data”. This hosting contributes towards the maintenance aspect I mentioned above.

Full details on those processes will be available in the forthcoming publication, a link to which I’ll add when it becomes available. Meanwhile, if you’d like to know more about this project, or about what we can offer in the Data Science and Analytics areas, please drop me a line at alex.chapman@gaiaresources.com.au, or connect with us on Twitter, LinkedIn or Facebook.

Alex

The post The data science of plant trait data appeared first on Gaia Resources.

Counting fish – supporting research at the Australian Institute of Marine Science

Chris Roach — Fri, 15 Jan 2021 04:32:26 +0000

You may have heard the news on Tuesday from the Commonwealth government media release where the Minister for Industry, Science and Technology Karen Andrews announced grants to develop products that improve our natural environment. We are very excited to announce that Gaia Resources (mentioned as Tekno Pty Ltd) is one of the grant recipients, and we have some really exciting work ahead of us in the coming months.

Together with some excellent partners and the Australian Institute of Marine Science (AIMS), we will be looking at leveraging Artificial Intelligence (AI) and Machine Learning (ML) tools to identify fish species, counts and other measures from underwater video footage. Tailored to the research challenges that AIMS faces, we are hoping our work will continue on to develop products and insights that can streamline marine research programs and conservation efforts. Our focus will be to support scientific understanding of critical issues and build online tools to streamline and expand the capacity and program effectiveness.

Our team is really looking forward to getting started, and I am sure we will have an update for our interested readers in a few months’ time. Feel free to give me a call or an email though if this type of work interests you – strike up a conversation on Twitter, LinkedIn or Facebook.

Chris

The post Counting fish – supporting research at the Australian Institute of Marine Science appeared first on Gaia Resources.

The ALA and Big Data for Biodiversity

Mieke Strong — Fri, 11 Dec 2020 00:29:21 +0000

On Wednesday 9 December, Chris Roach and I attended a webinar hosted by the Atlas of Living Australia (ALA), celebrating its 10 years of existence and showcasing research into the role of Big Data and data science modelling techniques in managing Australian biodiversity. It was a chance for me to also reflect on my journey in parallel with the ALA in the early days when I was at the Western Australian Museum. I was involved there in aligning the Arachnology database fields with the TDWG Darwin Core standard, so the web team could mobilise our data; then later in environmental consulting; and now here at Gaia Resources where we share much of the ideals of the ALA in enabling open biodiversity data sharing and aligning to internationally recognised standards.

The following provides a summary of some of the important research that was described in this particular seminar series of three speakers.

With platforms such as the ALA, the amount of biodiversity data available has dramatically increased in the last 10 years and empowered biodiversity conservation with so much more confidence in actions undertaken; but many of the ecological challenges that we have faced in the past still remain. These challenges can be summed up in three main areas:

Sampling bias,
Incomplete coverage and,
Data quality.

Professor Melodie McGeoch (La Trobe University) discussed the importance of not just focusing on documenting populations of threatened, vulnerable, and endangered species; but also the need to recognise the importance of occurrence data for “common” species. Whether a species is recognised as common depends on temporal trends, local abundance, and spatial range; and significant declines in any of these areas may go unnoticed when a species is thought to be common enough not to require frequent monitoring. In terms of identifying refuges for preventing diversity and biomass decline, Prof. McGeoch advocated for the modelling of ALA and other data of both rare and common species at a more localised level to understand geographic variation and abundance over time.

PhD candidate Tianxiao (August) Hao (University of Melbourne) used his research in fungal diversity in Australia to show the rapid increase in data availability. Some of this data, however, is unreliable, and so careful consideration must be taken prior to analysis as to whether the data is of a high enough standard to be useful. He acknowledged the new technology and rigorous screening that new data submitted to the ALA undergoes and the large clean up operation that is underway to increase the quality of legacy data.

Both August and Professor Jane Elith (University of Melbourne) demonstrated how the available data is still biased greatly by sampling effort due to environmental or logistical constraints. It makes sense that the easiest to reach places, such as areas near population centres, coastlines, and along roads, are the most heavily sampled.

Professor Elith also highlighted the much forgotten bias introduced by a deficiency in absence data. Most ‘observation’ records are for presence data, but having knowledge of what areas have been sampled (and how) without finding occurrences is possibly of equal significance to documenting the presence of species. Predictive modelling of species distributions is so much more powerful when it can account for bias, and ideally this presence-absence type of data capture should be integrated into research and citizen science initiatives.
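A small sketch shows why absence records matter. When every survey visit is recorded, including visits that found nothing, we can estimate the proportion of surveyed sites where a species was detected. With presence-only records there is no denominator to divide by. The data layout and function name are assumptions for illustration, and this naive figure ignores imperfect detection, which proper occupancy models account for.

```python
def naive_occupancy(surveys):
    """Share of surveyed sites with at least one detection.

    surveys: iterable of (site_id, detected) pairs, one per survey visit.
    """
    detected_at = {}
    for site, detected in surveys:
        # A site counts as occupied if any visit detected the species.
        detected_at[site] = detected_at.get(site, False) or bool(detected)
    if not detected_at:
        raise ValueError("no surveys supplied")
    return sum(detected_at.values()) / len(detected_at)


# Made-up survey visits: sites B and C were visited but nothing was found.
visits = [("A", True), ("A", False), ("B", False), ("C", False), ("D", True)]
print(naive_occupancy(visits))
```

If the visits to sites B and C had never been recorded, the same data would suggest the species was found everywhere anyone looked.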

Professor Elith showcased the eBird initiative as a good example of where using citizen science can provide comprehensive coverage of occurrence data over time.

Gaia Resources is no stranger to considerations of presence-absence data and has developed several Citizen Science solutions over the years. We have also worked with conservation groups like the Great Victoria Desert Biodiversity Trust to plan habitat survey strategies (check out our blog here).

With the help of open-access biodiversity data such as that provided by the ALA, we can all play a part in overcoming the challenges faced in conservation. Here’s to the next 10 years!

If you’d like to know more about this topic or would like to discuss your own Big Data and biodiversity projects, please drop me a line at mieke.strong@gaiaresources.com.au, or connect with us on Twitter, LinkedIn or Facebook.

Mieke

The post The ALA and Big Data for Biodiversity appeared first on Gaia Resources.

Data Science Upskilling Bootcamp

Barbara Zakrzewska — Fri, 24 Jul 2020 00:30:43 +0000

A new and interesting trend is emerging as teachers from all over the world embrace virtual training. It opens many opportunities for both facilitators and knowledge-seekers from different continents and time zones. Training formats are changing too, as there’s no need to hire a venue and sessions can be shorter and better tailored to an audience’s attention span. Our world is changing and with better internet speeds we can now learn from any expert in the field, if they are happy to share their knowledge.

The recent WADSIH/Halliburton Data Science Bootcamp was the first virtual training event I’ve ever attended. For one week, Dr Satyam Priyadarshy and his team of scientists were guiding us through the world of data science, machine learning, the use of neural networks, and artificial intelligence. The unusual factor was that they were doing so from different parts of India.

On the first day of training, Dr Priyadarshy quoted the ancient parable of the Blind Men and an Elephant to help us understand fundamental data science and machine learning concepts.

So, how do we make sense of the sheer volume of data that is collected nowadays? What are the best ways? It can be very hard to see the big picture when one is overwhelmed by data coming from just one strategy or process in an organisation.

That’s where data science algorithms and tools come in – to find patterns and reveal how seemingly separate processes influence each other. Machine learning allows us to look at multiple variables and predict behaviours.

In this workshop we used Jupyter Notebook, accessible through Anaconda – a scalable data science platform. Because it is browser-based, I could verify each line of Python code straight away. I was quite surprised how easy it was to remember syntax and to grasp the content of each learning module. Soon I was looking forward to each of the three-hour coding sessions.

We used Python modules to predict trends and derive patterns from seismic data. Neural networks and machine learning algorithms were explained, demystified and illustrated through experience. And to my surprise it was fun! I look forward to implementing this new knowledge in future Gaia projects.

If you’d like to know more, feel free to reach out and see what Gaia Resources can do for you in this rapidly developing space. Comment below, contact me at barbara.zakrzewska@gaiaresources.com.au, or start a chat via Facebook, Twitter or LinkedIn.

The post Data Science Upskilling Bootcamp appeared first on Gaia Resources.

Twenty years of WA floristic data

Alex Chapman — Wed, 03 Jun 2020 07:41:23 +0000

I’ve been with Gaia Resources for six years now (!), and am grateful to have found a home where my specialist knowledge of taxonomy, systematics and biodiversity informatics adds value to the enterprise.

I remain a research associate at the Western Australian Herbarium, both in order to continue my research into WA’s heaths – the Ericaceae – and to provide some institutional services.

One of these has been to collate and summarise significant changes in flora statistics from year to year, starting with the publication of the Descriptive Catalogue (Paczkowska and Chapman, 2000). This included a simple table of major floristic data, and a comparison with previous stats from the past century (updated here in Figure 1).

Exactly the same data was also used in a major paper comparing species richness and endemism in Mediterranean biomes (Beard, Chapman and Gioia, 2000). All this was made possible by my work with the greatly missed Paul Gioia, with whom I built the first statewide census database for vascular and cryptogamic flora – WACENSUS – begun in 1990. Of course, there were previously printed publications by John Green (1981, 1985) and maintained in supplement subsequently by Nicholas Lander, that provided the initial source data. The WACENSUS database, however, enabled real-time documentation of plant names in play for the State, and an immutable Life Science Identifier (LSID) that could be referred to across information systems.

In that time, the flora statistics have documented a steady increase in the number of species, both published and ‘putative’ (i.e. manuscript and phrase-name taxa) (Figure 2).

2020 marks a number of milestones for the State’s documentation of our precious and unique flora:

the 50th anniversary of the first edition of Nuytsia – WA’s systematic botany journal, that provides much of the published taxonomic work describing and classifying the State’s flora;
30 years since ‘the Census’ became a functional database underpinning an authoritative, accurate and up-to-date source for plant names in current use (and their synonyms) in WA;
20 years since ‘The Western Australian Flora – A Descriptive Catalogue’ was published, from which the descriptive query capacity of FloraBase was drawn;
20 years since the last major analysis of the uniqueness of our State’s flora (especially with regards to other Mediterranean floras of the world);
17 years since the last major revision of ‘FloraBase — the Western Australian Flora’ was released.

In the intervening years, Paul Gioia worked to manually integrate the Western Australian Museum’s faunal names into the Census as well, in order to maximise that knowledge in his major work – NatureMap (2007 onwards). As a result, WA has a standardised names dataset for much of the biota of the state.

It is very heartening to me to see the development of the next generation of WA’s biodiversity information systems through the work of the newly-funded Biodiversity Information Office – see last week’s post.

Yesterday, I received the 2019-20 flora statistics data, from which I will extract the significant highlights and changes for the past year. This complex report was only automated (after years of testing for veracity against my manually-calculated version) in 2019. Again, this could not have been achieved without the dedicated work of Paul Gioia, Ben Richardson and the invaluable curatorial team at the WA Herbarium. These results will be published in FloraBase in coming weeks.

UPDATE: The 2020 flora statistics are now available.

If you’d like to discuss any of the topics covered in this post, please drop me a line at alex.chapman@gaiaresources.com.au, or connect with us on Twitter, LinkedIn or Facebook.

Alex

The post Twenty years of WA floristic data appeared first on Gaia Resources.