Existing similar systems include the Global Biodiversity Information Facility, the Non-indigenous Species Database Network, and Discover Life. These systems use different approaches, including caching, cross-database searches, web services, and web scraping. The key question this study answered is which approach is best for creating this system.
A series of user surveys and interviews determined the required features for the system. Different possible approaches were tested for reliability and performance. The best approach was then selected based on how well each approach matched user needs.
Results/Conclusions Surveys of providers showed a wide variance in available information technology (IT) resources. The top user needs included searching across multiple databases and mapping occurrence data from different providers. Performance tests showed that approaches without a “cache” had search times that increased linearly with the number of databases searched. With over 200 potential databases, this resulted in a cross-database search time of over two minutes.
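The linear relationship can be illustrated with a small arithmetic sketch. It assumes, purely for illustration, that the databases are queried one after another and that each takes roughly the same time; the function names are hypothetical and the per-database figure is derived from the numbers above, not measured.

```python
def sequential_search_time(num_databases, seconds_per_database):
    """Total time when databases are queried one after another (assumed model)."""
    return num_databases * seconds_per_database


def implied_seconds_per_database(total_seconds, num_databases):
    """Average per-database time implied by a total search time."""
    return total_seconds / num_databases


if __name__ == "__main__":
    # Two minutes across 200 databases implies this average per database.
    per_db = implied_seconds_per_database(120, 200)
    print(f"Implied average per database: {per_db:.2f} s")
    # Under the assumed model, total time grows in proportion to the count.
    for n in (50, 100, 200):
        print(n, "databases:", round(sequential_search_time(n, per_db), 1), "s")
```

Under this model, a search that takes over two minutes across 200 databases corresponds to an average of more than 0.6 seconds per database, so adding providers makes every search slower.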
A search time of over two minutes is unacceptable for searching and mapping. Based on this and other results, GISIN will include a cache of data from providers. To accommodate the range of provider IT resources, the system allows providers to contribute data to the cache by: (1) installing a web service on an existing database, (2) having the system harvest from a text file on a server, and (3) uploading a file directly into the cache. The cache will include a high-performance database for executing user queries and creating maps. Individuals and organizations can access data in the system through an easy-to-use web site or a set of high-performance web services. This will allow a broad range of organizations to develop additional features and perform research, such as modeling and risk assessment, for the invasive species management community.
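The three contribution paths and the single cached query can be sketched as follows. This is a minimal illustration and not the GISIN implementation: the table layout, field names, and function names are all assumptions, an in-memory SQLite database stands in for the high-performance cache database, and a plain callable stands in for a provider's web service.

```python
import csv
import io
import sqlite3


def create_cache():
    """Create an in-memory cache with a hypothetical occurrence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE occurrence (provider TEXT, species TEXT, lat REAL, lon REAL)"
    )
    conn.execute("CREATE INDEX idx_species ON occurrence (species)")
    return conn


def add_records(conn, provider, records):
    """Insert an iterable of dict-like records; return the number stored."""
    rows = [
        (provider, r["species"], float(r["lat"]), float(r["lon"])) for r in records
    ]
    conn.executemany("INSERT INTO occurrence VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def load_from_web_service(conn, provider, fetch):
    """Path 1: 'fetch' is a stand-in callable for a provider's web service."""
    return add_records(conn, provider, fetch())


def load_from_delimited_text(conn, provider, text):
    """Paths 2 and 3: parse comma-delimited text, whether harvested or uploaded."""
    return add_records(conn, provider, csv.DictReader(io.StringIO(text)))


def search(conn, species):
    """One query against the cache, regardless of how many providers exist."""
    cur = conn.execute(
        "SELECT provider, lat, lon FROM occurrence WHERE species = ? "
        "ORDER BY provider",
        (species,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    cache = create_cache()
    # Illustrative records only; provider names and coordinates are made up.
    load_from_web_service(
        cache, "provider_a",
        lambda: [{"species": "Bromus tectorum", "lat": 40.5, "lon": -105.1}],
    )
    harvested = "species,lat,lon\nBromus tectorum,39.7,-104.9\n"
    load_from_delimited_text(cache, "provider_b", harvested)
    uploaded = "species,lat,lon\nTamarix ramosissima,36.1,-112.1\n"
    load_from_delimited_text(cache, "provider_c", uploaded)
    print(search(cache, "Bromus tectorum"))
```

In this sketch the harvest and upload paths share one parser because they differ only in who initiates the transfer. Whatever the path, records end up in one indexed table, so a user query touches the cache once instead of contacting each provider.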