As promised in my previous post , this post is about using Kobo Toolbox as a data collection tool for archaeology. I also address important issues that have materialized as I design and use forms in Kobo Toolbox.
What is Kobo Toolbox? Kobo Toolbox is an open source suite of tools for the collection and analysis of humanitarian data- especially in remote places or after a disaster. Kobo Toolbox is comprised of three different tools, one for form design, one for data collection and another for data analysis (the latter is rather simple and won’t be discussed here). Kobo Toolbox can be used from their website or installed on a local server. It was designed by a fairly large collective organized by the Harvard Humanitarian Initiative and supported by lots of heavy hitters, including the United Nations, the International Rescue Committee and US government- through USAID. It is very well supported and has a very active and dynamic user and collaborator community.It appears to be sustainable. Similarly, the data collection form is based upon Enketo, another open source, well supported and actively growing tool.
Why Kobo Toolbox? First and most importantly, Kobo Toolbox is device independent (though there is an Android app as well). Instead of operating via an app, it runs as a webform that can be accessed through a browser on any device that can run a browser (browser does need to be updated to handle HTML5; Google Chrome seems to be the best at this point; see this). With the expansion of the use of smartphones throughout the world, this is incredibly significant. Anyone with a device with a browser can collect data. Personally, in trying to devise ways to collect archaeological data digitally, I purchased software and hardware into the $1000s- and I was doing it “on the cheap”. As my two iPads aged substantially between field seasons, I became increasingly frustrated because the expectation was that I was going to need to purchase new iPads in the near future- I was stuck in the technology treadmill. The system was unsustainable. With Kobo Toolbox, an inexpensive smart phone (yes, they do exist) is all one needs. Data collection becomes much less restricted and the opportunities for collaboration with interested communities, especially in remote places, is much, much greater.
Second, web forms can be used OFF-LINE! That’s right, a WEB form that can be used OFF-LINE. This is essential for nearly all situations in archaeology. Even if you have access to the internet (through nearby WiFi or a cellular data connection), it is likely that your connection will be cut at some point- usually the most inconvenient one- leaving you unable to collect data. Not with Kobo Toolbox. You can continue to collect data; it will upload and synchronize once a data connection has been reestablished. How much data is based upon the browser and settings with the browser; here’s details for Google Chrome. Important update: questions are stored in question banks that can now be shared!
Third, Kobo Toolbox includes an intuitive form design tool via the web. Most importantly, this means that the “learning curve” or threshold is very low. I was able to create basic forms in minutes the very first time I tried the tool. However, for more advanced data collection, the web design tool can be nearly as complicated as one desires. Data types are varied and include everything from alphanumeric to GPS location to images and audio. There are some qualifiers here; for example, barcodes can be collected using the Android app, but not the web form. Data can be collected by radio buttons, check boxes, drop down lists and many other ways. Options include skip logic (i.e. show certain questions based upon the response to previous questions) and validation criteria, both of which increase ease of use and the reliability of the data. Form construction can be even more complex if XLSforms (http://xlsform.org/ )(based upon the open standard XForms; https://www.w3.org/MarkUp/Forms/ ) are used. XLSforms can be designed using the ubiquitous Microsoft Excel (LibreOffice Calc (https://www.libreoffice.org/discover/calc/ ) could be used as well and the file saved as .xls). The tool, therefore, is incredibly easy to use from the very beginning, but can be as complex as the user demands.
How about a quick example? In the fall of 2015, I taught a course entitled “Historical Ecology of the Lehigh Gap” at Muhlenberg College, which was “clustered” with another course, “Degradation and Restoration” taught by my colleague Kimberly Heiman. As a component of this course, we mapped plant distributions along a transect through the Lehigh Gap Nature Center. The Lehigh Gap is a Superfund site contaminated by heavy metals from a zinc factory. The purpose of the transect was to identify different plants that represented degraded or restored communities. I created the form (see example) in an evening and shared the link with the students.
In the field, students pulled up the link on their smartphones. Although there were some hiccups, the form worked like a charm. We had one phone that could not use the forms- we still do not know why. Some students found that one browser was preferable to others (Chrome seemed to be the best) and we had significant issues collecting photos. Except for a few exceptions, GPS coordinates were within their known error (c. 10 meters). Those points that were not located within the expected 10 meters appear to have been random. That is, there was no apparent pattern that would suggest particular phone brands, individual phones, or users that were less accurate than others. The data was exported (via CSV) and imported into CartoDB, which students used to analyze the spatial distribution of plants. The map below shows one day of collection and only shows one plant, sweet birch, which in this case actually shows the effectiveness of prescribed burns on the northeastern portion of the transect (birch tend to pull heavy metals from the soil and reintroduce them into the ecosystem; controlled burns limit the growth of birch).
This was a particularly effective exercise. We were able to collect approximately 75 data points on 10 plants at 10 meters intervals (i.e. a distance of 3/4 kilometers) in approximately 2 hours with 17 students. Data collection required no special tools, but students were able to collect and analyze a rich data set with relatively simple, intuitive tools.
Ok, it may appear that I am trying to “sell” Kobo Toolbox. However, let’s face it, there are some drawbacks to Kobo Toolbox.
First, data entry via a small screen virtually requires that typing be minimized- drop down boxes, radio buttons, etc. are far superior. This means that it is preferable to design these types of data entry into the form, which has the positive effect of standardized data, but also can reduce recording important, narrative data. Really, this is not a drawback of Kobo Toolbox, but of the device. Narrative data would be best collected via audio or text through voice recognition, but neither of these is ideal either.
Second, because Kobo Toolbox is designed around location collected through the device GPS, it is wonderful for survey, but the location aspect is less useful for excavation. If a sub-centimeter RTK GPS was connected to the device (via Bluetooth?) the location aspect of the form would be much more useful for archaeology, but that requires serious expense (or, perhaps not). The device could also be connected via Bluetooth to a total station, etc. for increased locational control. Excavation data is likely better collected via a tablet, rather than a phone.
Third, initially I wanted a tool that could be connected directly to a relational database; at this point, this cannot be easily done with Kobo Toolbox. However, because the format of the Kobo export is a CSV file, the data can be easily synchronized with any database, relational or otherwise with relatively little effort. I am now convinced that this sort of compartmentalization is actually preferable. With such a format, the user can decide how to store, analyze and archive their data. While I now prefer a PostGIS database accessed through a LibreOffice Base, others may prefer to database types or GUIs. Compartmentalization means that no one tool is reliant upon any other, but it does mean that standardized (and preferably open) data formats are required to go between tools. Compartmentalization also means that forms must be designed with the database in mind and vice versa, the database must be designed with the intension that all data will be arriving in CSV files (alternatives include KML and XLS; except images, audio, etc., which must be entered into the database manually).
Fourth, I also initially wanted the ability to modify forms in the field. I was able to do this with my initial Filemaker Pro database, but only because I carried the server into the field. However, I used adjusted forms largely during field testing. Adjusting forms in Kobo Toolbox once they have been deployed is difficult, requires an internet connection and new “projects” must be created with new URLs, etc (side note- you can install Kobo Toolbox on your own server and take it into the field). However, once past the testing phase this may not be important; once data collection has begun in earnest, changing forms is usually a poor idea because it reduces consistency and makes synchronization more difficult.
Although Kobo Toolbox has some important limitations, I now consider these limitations to be beneficial. Open source tools that do one thing extremely well and that use open standards and open file formats are preferable because their output is useful in a wide variety of other tools (and in ways that I cannot even imagine).
Hey – we at KoBoToolbox are very excited to see our tools used for applications well beyond what we first imagined! For everyone, I can confirm that Ben will not be getting a rebate for ‘selling’ KoBoToolbox since our tools are already free and open-source! One update – the new version will support modifying a form that has already been deployed – our original concern was about maintaining the integrity of the data being collected, but we heard from many users about the need to be able to make minor adjustments (like spelling, translation,…). We are also working on better analytic but more on that in a few months!
Thank you for sharing this, Ben! Based upon your recommendation, I am going to use it in my classes next year. I spent the fall having to borrow other people’s fancy hand-held Trimbles because I don’t have a budget to get one, and then having to borrow other people’s fancy computers with Trimble software to download the data off of the Trimble and import it into ArcGIS. It was frustrating to say the least. Thank you again for this great post!
Stacey Camp
Hi
I’m interested in using this for archaeological recording. A couple of problems have arisen as I created the proto-form… firstly, traditionally we would include a sketch drawing as part of each record. Can you draw freehand as a response to a ‘question’ in Kobe toolbox?
Second we would expect a stratigraphic matrix to 3 levels (this context, immediately above and immediately below) i.e. Harris matrix. Aside from having these as individual questions can this information be presented in the way many archaeologists are accustomed to seeing it? Finally, how many photos can be taken as a response to a single question? Just one or multiple?
I suspect the answer might be that we archaeologists need to rethink the purpose of the data we traditionally record, e.g. did we do sketches because photos were not part of the written record, now we can have a photo instead should / could we do away with the sketch? Or does the sketch still have an important role in our recording…
But any further guidance into the capabilities of Kobe toolbox for archaeological recording would be very useful in making those decisions.
Thanks
Gaylynne
Gaylynne,
Thanks so much for the message. The basic answer is no, there is no option for a freehand drawing. However, you can sketch in a different app, then insert the image into Kobo Toolbox. Yeah, I know, it’s not ideal. You can “reconnect” a digital drawing with the Kobo collected data in a database (such as PostGIS) once you come in from the field.
I believe you can only attach one photo to each “question”. So, you could make multiple questions, but you would need to limit yourself… Let’s say you have a “unit” form. You could take 5 photos- one from the north, south, east an west and then one “plan” view. Would you need more? If you want photos for artifacts, you could also standardize those photos so that you have a relatively predictable set of photos. However, let’s say you just want to collect a bunch of photos from a newly located site. I would say that you could have two separate forms- one “site” form and then a photo log in which one the responses to a question is the site number referred to in “site” form. That way, you can reconnect the “flat” files with each other to make a relational database.
If you are interested, see more here: benjaminpcarter.com
Cheers,
Ben