Bringing the web inside the lab

This post was written by Randy Julian on June 3, 2010
Posted Under: Accessible Data, Linked Data

Do More With Less

There was a time in the pharmaceutical industry when resources seemed to be in limitless supply. If that was ever really true, those times are certainly gone now. Indigo’s products are designed to allow laboratory staff to do more with less. And not just a little more but A LOT more – and do it better. Equipment automation helps, but eliminating the time wasted finding and manually manipulating data is equally important. It’s a little sad that, with all the money that has been spent on data archiving, storage and electronic notebooks, more productivity enhancement hasn’t been observed. Despite millions spent by technology groups supporting laboratories, data is still too hard to manage, find, organize and analyze.

Some reasons for this are summed up by Brian Sletten in “Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design”, edited by Diomidis Spinellis and Georgios Gousios:

It is with great shame that we as an IT industry must acknowledge this embarrassing fact: it is easier for most organizations to find information on the Web than to find information in their own systems. Think about that for a moment. It is easier for them to locate data, through third parties, on a global information system than to do so within environments in which they have complete control and visibility. There are many reasons for this travesty, but the biggest problem is that we tend to use the wrong abstractions internally, overemphasizing our software and services and underemphasizing our data.

Indigo has deliberately taken the world-wide-web playbook and made it the central premise of our products. We believe that it is worth emphasizing data in the architecture of a laboratory system since this is what is generated and what will make or break a system in real world scientific work.

Sletten, who is a respected IT consultant, suggests the value of solving this problem:

…an information-focused architecture in the Enterprise demonstrates some of the same positive properties as the Web: scalability, flexibility, architectural migration strategies, information-driven access control, and so on. In the process, it empowers the business side of the house to make capital investment and software development decisions based on business needs, not simply because fragile technology choices require them to pay for flux.

We at Indigo think that empowering the ‘business side of the house’ will enable drug companies to improve their pipelines by freeing scientists to be more productive with information systems. We believe that as an information generating community, pharmaceutical scientists have one of the most important and difficult tasks: turn oceans of uncertain data into therapies which improve lives. Shifting the design emphasis to data and information instead of software, infrastructure, hardware and services, will improve the productivity of the drug industry. In other words, stop worrying so much about what languages, servers, protocols and software technologies are used and start worrying about how the data are represented, analyzed and used.

An Example

The basic idea is to make accessing internal lab data look more like accessing data on the Web. This sounds nice, but is actually tricky. It means doing what the web does: first, give everything a Web-like address and second, translate addresses into physical representations matching what is expected. As a tutorial on this approach I will walk through some examples from the BLAIS Proteomics Center at the Dana-Farber Cancer Institute for displaying proteomics analysis and analytical data.

Look at the following URI (Uniform Resource Identifier):

http://blaispathways.dfci.harvard.edu/mzServer/files/FLT3_iTRAQ/scans/55.603.html

http://blaispathways.dfci.harvard.edu/mzServer is the name of the server. The next part, /files, tells the service to access a file, in this case one named FLT3_iTRAQ. Then /scans says we are interested in looking at a specific scan, and 55.603 is the acquisition time of that scan: it tells the server to retrieve the scan at 55.603 minutes. The next part is very cool. By specifying 55.603.html, we are telling the server we want to see the scan at that time rendered as an HTML page. I could have said 55.603.jpg and it would have returned a JPEG file containing an image of the scan.
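To make the anatomy of that address concrete, here is a minimal Python sketch of how a client or server might split a scan URI into its parts. The function name parse_scan_uri and the dictionary keys are assumptions made for illustration; this is not the actual mzServer code, only one way the segments described above could be read.

```python
from urllib.parse import urlsplit


def parse_scan_uri(uri):
    """Split a scan URI into file name, scan time (minutes) and format.

    Hypothetical helper: it assumes the path layout described in the post,
    .../files/<file name>/scans/<time>.<format>
    """
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    i = segments.index("files")          # raises ValueError if absent
    file_name = segments[i + 1]
    if segments[i + 2] != "scans":
        raise ValueError("not a scan URI")
    # Split on the LAST dot so "55.603.html" gives "55.603" and "html".
    time_text, _, fmt = segments[i + 3].rpartition(".")
    return {"file": file_name, "time_min": float(time_text), "format": fmt}


parts = parse_scan_uri(
    "http://blaispathways.dfci.harvard.edu/mzServer/files/FLT3_iTRAQ/scans/55.603.html"
)
print(parts)
```

Swapping the final extension from html to jpg changes only the "format" value, which is the piece a server would use to decide how to render the same scan.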

I don’t have to get the time exactly right either. I could just say 55.6.html and it would work, because the server is not looking for a file of that name (like a traditional Web server would), but passing the ‘request’ for something like 55.6 into an algorithm which will retrieve the needed scan. In this case it just grabs the closest scan to what we asked for, which is what we really meant.
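The "closest scan" behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not the server's actual algorithm: the function name closest_scan_time and the list of acquisition times are made up for the example, and the sketch assumes the times are already sorted.

```python
from bisect import bisect_left


def closest_scan_time(scan_times, requested):
    """Return the entry of sorted scan_times nearest to requested (minutes)."""
    if not scan_times:
        raise ValueError("no scans available")
    i = bisect_left(scan_times, requested)
    if i == 0:
        return scan_times[0]
    if i == len(scan_times):
        return scan_times[-1]
    before, after = scan_times[i - 1], scan_times[i]
    # Ties go to the earlier scan; that choice is arbitrary in this sketch.
    return before if requested - before <= after - requested else after


# Hypothetical acquisition times, in minutes.
times = [55.412, 55.603, 55.798]
print(closest_scan_time(times, 55.6))  # prints 55.603
```

A request for 55.6 therefore resolves to the scan acquired at 55.603 minutes, even though no resource with the literal name 55.6 exists.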

Queries are therefore constructed by users or applications by assembling URI strings rather than making complicated calls to complex Web Service or database technologies. Here’s another example:

http://blaispathways.dfci.harvard.edu/mzServer/files/FLT3_iTRAQ/ric/52.603-58.603/732.077-732.097.html

This request computes a “Reconstructed Ion Chromatogram” (ric) from the specified file over the time range 52.603-58.603, summing the ion signal from masses between 732.077 and 732.097, and then returns that chromatogram as an HTML page.
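As a rough sketch of what such a request asks the server to do, the Python below parses the two ranges from the address and sums the signal inside the mass window for every scan inside the time window. The in-memory layout (a list of (time, peaks) pairs) and the function names are assumptions for illustration; the real service reads instrument files and may compute the chromatogram differently.

```python
def parse_range(text):
    """Turn "52.603-58.603" into (52.603, 58.603). Assumes non-negative values."""
    low, high = text.split("-")
    return float(low), float(high)


def reconstructed_ion_chromatogram(scans, time_range, mass_range):
    """Sum intensity within mass_range for each scan within time_range.

    scans: iterable of (time, [(mass, intensity), ...]) pairs (hypothetical layout).
    Returns a list of (time, summed_intensity) points.
    """
    t_lo, t_hi = time_range
    m_lo, m_hi = mass_range
    ric = []
    for time, peaks in scans:
        if t_lo <= time <= t_hi:
            total = sum(inten for mass, inten in peaks if m_lo <= mass <= m_hi)
            ric.append((time, total))
    return ric


# Made-up scans, only to show the shape of the calculation.
scans = [
    (52.0, [(732.08, 10.0)]),
    (53.0, [(732.08, 5.0), (732.09, 2.0), (800.0, 99.0)]),
    (59.0, [(732.08, 1.0)]),
]
ric = reconstructed_ion_chromatogram(
    scans, parse_range("52.603-58.603"), parse_range("732.077-732.097")
)
print(ric)  # prints [(53.0, 7.0)]
```

Only the scan at 53.0 falls inside the time window, and only its two peaks near 732.08 fall inside the mass window, so the result is a single point.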

This proteomics example reveals another important aspect of using web addresses: agreement on nomenclature and terminology. For example, the vocabulary used in the address has to be standardized (you have to use /files, not /data or even /file). The units of the time elements and the mass elements must also be clearly defined and understood before this interface can be used. Once a standardized naming convention is established, however, it’s much easier to use this interface than something made from SOAP, WS-*, SQL or any number of other technologies. It’s not about the technology, it’s about the data.
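One way an application might stay within the agreed vocabulary is to assemble its addresses through a single small helper rather than by hand. The Python sketch below is an assumption-laden illustration: build_uri and RESOURCES are hypothetical names, and the set contains only the two resource words that appear in this post (scans and ric), not a complete list of what the server supports.

```python
from urllib.parse import quote

BASE = "http://blaispathways.dfci.harvard.edu/mzServer"
# Hypothetical vocabulary: only the resource words shown in this post.
RESOURCES = {"scans", "ric"}


def build_uri(file_name, resource, *parts, fmt="html"):
    """Assemble a request URI, rejecting words outside the agreed vocabulary."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    if not parts:
        raise ValueError("at least one path part is required")
    segments = ["files", quote(file_name, safe=""), resource, *parts]
    return BASE + "/" + "/".join(segments) + "." + fmt


# Times are in minutes, as in the scan example; mass units follow the server's convention.
print(build_uri("FLT3_iTRAQ", "scans", "55.603"))
print(build_uri("FLT3_iTRAQ", "ric", "52.603-58.603", "732.077-732.097"))
```

The two calls reproduce the two example addresses used in this post, and a misspelled resource such as "scan" fails immediately in the client instead of producing a request the server cannot interpret.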

Conclusion

Indigo makes data the central idea in the automation of laboratory tasks. We believe that focusing on the content and hiding the technical details is what makes the web great, and that the same approach will help make more usable, scalable and valuable systems for lab scientists.

Contact us if you would like to bring the web to your lab and find, analyze and use data from inside your company as easily as you use the web.
