The Cost of Quality in the Laboratory

As most laboratory personnel know, errors within the laboratory can be harmful.  In clinical diagnostics, technicians are trained to understand that the results produced by laboratories are used to treat patients and that errors can be fatal to these patients.  Similar consequences can be found in other industries.

With that understanding, laboratories invest in quality, sometimes without fully understanding the cost of the actions performed to achieve (supposedly) a certain level of quality.  Many of the quality procedures in the laboratory exist to satisfy the minimum regulatory requirements deemed to yield the quality necessary to be licensed.

But does your laboratory really know the true cost of quality?  And is the level of investment in achieving that quality a conscious choice, or simply the default of the process adopted?  Knowing these costs is important when implementing quality improvement processes, including automation.  Let’s review some of the elements of quality costs to help you quantify them in your laboratory.

The cost of quality is any cost that would not have been incurred if quality were perfect. Perfect quality exists in theory, but in practice it is difficult if not impossible to obtain.  Therefore, there will always be a cost associated with poor quality.  Significant chunks of this cost remain hidden because accounting systems are not designed to identify them, and thus these costs are buried in routine operational costs. Getting a handle on these costs helps identify opportunities for improvement within a laboratory, leading to a reduction in the cost of errors.

Client Complaint Costs (External Failure Costs):  When your client (doctor, scientist, regulator) calls to question the validity of a laboratory result, a set of actions is triggered in response.  Among these actions are researching how the result was obtained, the state of the sample at testing, and other possible mishandling within the testing and reporting process.

The costs of client complaints are varied; some are tangible:

  • The effort researching how the error occurred
  • Correcting the error, including retesting, if possible
  • Regulatory fines when applicable
  • Lawsuit expenses when applicable

Some are intangible:

  • Potential loss of revenue from the complaining client’s account
  • Damage to the reputation of the laboratory and loss of other business
  • Expense of repairing quality image

Quality Control Costs (Internal Failure Costs): Most laboratories associate this expense with the cost of quality. Again, the costs are tangible and intangible.

Among the tangible costs are:

  • The effort and material incurred in QC and standard samples
  • The effort required to review results – many laboratory processes require two people (sometimes three) to review the results in order to catch any errors
  • The effort and expense of rerunning tests that the review process finds to be in error

Among the intangible costs are:

  • Decreased service level (turn-around-time) that leads to loss of revenue
  • Decreased instrument capacity through the inability of the review process to keep up with the instrument output

Inspection Costs: These are the costs associated with monitoring compliance with regulatory requirements or the expected level of quality.  These costs include:

  • Effort and material to calibrate equipment
  • Effort to monitor quality of laboratory processes, including proficiency sample processing
  • Effort of internal audits in preparation for external audits, including audits from regulatory agencies

Prevention Costs: These are the costs invested to prevent errors from occurring.  Among these costs are the following:

  • Effort to develop and maintain a quality system, including documentation of standard operating procedures
  • Enrollment in quality surveys and other comparable quality prevention measures
  • Efforts spent in method development to ensure the production testing process produces quality results at a higher testing volume (it’s scalable)
  • Development of quality rules to ensure results comply with expected quality
  • Technician training

So, using these four elements, can you estimate the cost of quality in your laboratory?  Is it higher or lower than you expected? Can you find areas that would have a significant impact on quality and reduce cost?
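As a starting point for that estimate, here is a minimal sketch that adds up line items under the four elements described above. The class name, the line items and every dollar figure are hypothetical placeholders; substitute the numbers gathered from your own laboratory.

```python
from dataclasses import dataclass, field


@dataclass
class QualityCosts:
    """Annual cost estimates (one currency) grouped by the four elements."""
    external_failure: dict = field(default_factory=dict)  # client complaints
    internal_failure: dict = field(default_factory=dict)  # quality control
    inspection: dict = field(default_factory=dict)
    prevention: dict = field(default_factory=dict)

    def totals(self):
        """Return the subtotal for each of the four elements."""
        return {name: sum(items.values()) for name, items in vars(self).items()}

    def total(self):
        """Return the overall cost of quality."""
        return sum(self.totals().values())

    def share(self):
        """Return each element's fraction of the overall cost."""
        total = self.total()
        return {k: (v / total if total else 0.0) for k, v in self.totals().items()}


# Hypothetical figures, for illustration only.
example = QualityCosts(
    external_failure={"complaint investigation": 12000, "retesting": 8000},
    internal_failure={"QC samples": 30000, "result review": 45000, "reruns": 15000},
    inspection={"calibration": 10000, "internal audits": 6000},
    prevention={"SOP maintenance": 9000, "training": 15000},
)
```

Calling `example.share()` shows which element dominates, which is usually the first place to look for a quality improvement that also reduces cost.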

In my previous blog (The Human Side of Automation), I talked about the effects of automation on personnel.  Another part of the story I shared there is that many of the savings achieved came from understanding and calculating the cost of quality. I remember from that project that the accounting department was insufficiently prepared to calculate the cost of quality; it required extensive input from people in the laboratory who knew the processes.

I hope you find this helpful, especially if you are considering the implementation of a quality improvement step into your process and want to justify it to your management.

I invite you to share your insights on the cost of quality in your laboratory or share a story of how some quality improvement decision produced significant savings.


Traversing the Murky Depths of Inkscape

As a fledgling HCI Professional / Interface Designer, using a new piece of software is always an interesting and informative experience. Since beginning my college career, I’ve become significantly more enthralled by the idea of an interface working to better the overall usability of a program. Because of this, from my perspective, programs are effectively split in half: one half exploring how the software functions as a whole, the other involving the interface and how logical its design is. This two-part evaluation system really allows for a user to get a feel for how satisfying or frustrating a piece of software will be to use. With this knowledge, a user can then determine how to best proceed in terms of using that program. This could involve diving into a manual to learn more about usage or abandoning the application altogether and attempting to find a better solution.

A few weeks ago I was tasked with converting a rather daunting stack of data-flow diagrams into a digital format. There are a couple of choices for programs that would be best for this type of work (such as Visio on the PC and OmniGraffle on the Mac), but Indigo needed a solution that would have cross platform functionality to prevent issues with editing files down the road. With this requirement in mind, some quick research pointed in the direction of Inkscape (http://inkscape.org/), a free, open-source SVG editor with some pretty robust tools for making data-flow diagrams. The program appeared to be pretty well put together and suited for the type of work I would be doing. After quickly becoming acquainted with the tools, I started drawing. A few weeks and 116 diagrams later, I think it’s fairly safe to say I’m at least moderately experienced with the software. My overall feeling is that the software is, in most respects, very well suited to creating data flow diagrams. After using the program so extensively, I compiled a brief overview of the program’s pros and cons along with some useful tips about the application to assist you before you begin to diagram using Inkscape.

Pros:

  • Inkscape does a fantastic job of dealing with lines and curves. The Bezier Curve tool is easy to use and modifying curves / lines is very intuitive.
  • Because it is a vector based program, Inkscape can perform smoothing on hand-drawn lines using the Freehand Line tool. This allowed me to create the Gaussian Peaks used in some of the diagrams.
  • Similar to Photoshop, Inkscape has rulers at the top and side of the working window. The user can click and drag on these rulers to create guides, used to help align elements within the document.
  • The user is provided with an extensive collection of line strokes / elements that can be used in the creation of flowcharts. (Dotted / Dashed / Arrows / etc.)
  • Preset rotation amounts make it easier to align text with slanted lines. (More on this in the tips section)

Cons:

  • The biggest downfall of Inkscape is an interface in which certain actions can only be performed via key commands that aren’t very intuitive. For example, there is no clear way to rotate objects using the interface. Instead, a seemingly random key command is used (explored in the tips section) to handle rotation.
  • Initially, the way the program handles formatting was confusing. Text alignment didn’t seem to remain consistent. There are some tips regarding this in the section below.
  • There is no horizontal scroll bar that affects the view of your workspace. Instead, the horizontal scroll along the bottom pans left and right on the color selector located above it.
  • It appears that occasionally the program will render lines at different thicknesses, even if they have the same weight applied to them.

Tips and Tricks:

  • To scroll horizontally in the workspace, hold down the shift key and use the scroll wheel on the mouse. This was the only way I found I was able to scroll left and right.
  • To rotate a line / object / text box, hold down the alt (Windows) or option (Mac) key and use the open and close bracket keys ( [  and  ] ) to rotate left and right. This is a precision rotation used to line up objects with other objects.
  • For a controlled rotation, objects can be rotated fixed amounts by using certain keys in conjunction with the open and close bracket keys. Use the control key to rotate objects by 90 degrees and the Windows / Command key to rotate objects by 15 degrees.
  • When using the Bezier Curve tool to create a straight line, the control key can be held down to force the created line to snap to certain angles. These fixed angles correspond with the Windows / Command key rotation function mentioned above. This allows you to quickly align text and straight lines using the fixed 15 degree increment values.
  • The control key can also be used to create shapes that adhere to predefined ratios. This allows for easy creation of perfect circles / squares.
  • Regarding formatting, the program seems to use the following method to determine defaults. It is rather confusing, so I will explain by using an example.
    • Let’s say you create a text field and type something into the program. After typing, you decide to center the text within the field.
    • If you now create another text field, the program will revert back to left-justified, the default setting.
    • However, if you create a text field and set the formatting to be centered before inputting any data into the field, Inkscape will now apply centered as the new default for all future text fields. Subsequent text fields will be centered upon creation.
    • In general, Inkscape tends to set formatting defaults based on changes applied to empty fields. This can take a while to get used to and can become frustrating.
  • Inkscape includes the option to save your files as PDFs. However, if you are working with a large number of files and wish to maintain the .svg files for later editing as well, I would suggest waiting until all your files are created and then using a program to batch convert from .svg to .pdf.
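The batch conversion mentioned in the last tip can be sketched with Inkscape's own command line, so no extra program is needed. This is only a sketch under assumptions: the function names are made up for illustration, the `inkscape` executable is assumed to be on your path, and the export flag depends on your version (`--export-filename` in Inkscape 1.x, `--export-pdf` in the older 0.x releases).

```python
import subprocess
from pathlib import Path


def build_export_commands(svg_dir, inkscape="inkscape", legacy_cli=False):
    """Build one Inkscape command per .svg file in svg_dir.

    The .svg files are left untouched; each PDF is written next to its source.
    """
    commands = []
    for svg in sorted(Path(svg_dir).glob("*.svg")):
        pdf = svg.with_suffix(".pdf")
        if legacy_cli:
            export_flag = f"--export-pdf={pdf}"       # Inkscape 0.x
        else:
            export_flag = f"--export-filename={pdf}"  # Inkscape 1.x
        commands.append([inkscape, str(svg), export_flag])
    return commands


def convert_all(svg_dir, **kwargs):
    """Run the export commands one after another, stopping on the first error."""
    for command in build_export_commands(svg_dir, **kwargs):
        subprocess.run(command, check=True)
```

Building the command list separately from running it makes it easy to print and inspect the commands before converting a hundred diagrams.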

This is by no means a comprehensive guide to Inkscape, merely a short collection of thoughts and tips I collected while I was working with the program. Hopefully you found some of this information useful! Thanks!


The Human Side of Automation

As technology continues to improve, the opportunities for automation are increasing, sometimes in a dramatic fashion.  Many companies and managers view these as opportunities, while others view automation as a harbinger of difficult decisions, including decisions to eliminate positions.

As managers, when thinking about technology, most agree on two things:

  1. Automation technology is progressing fast.  This process simplifies not only blue collar, but also white collar jobs.
  2. The availability of these technologies will narrow the gap with one’s competitors so choosing not to adopt automation is not an option.

So, what makes managers hesitant about these automation decisions?

Probably the most important reason for adoption delays is fear of change.  I contend that an important component of that fear is the unpleasant thought of eliminating people’s jobs with automation.  What plan can be implemented to minimize such a bad side effect and embrace progress?

My first job out of college placed me on a project to replace a batch system with an online system, eliminating the need for key punch operators.  The company’s business analysts made a strong case for the competitive advantage of reduced turn-around time, along with the added benefit of cost reduction.

Working with the laboratory supervisors, I understood the benefits, but also noticed their concerns about changes the technology would bring.  One such concern was what to do with the surplus workers.

I brought this observation to the project steering committee, which included a wise old CFO.  He had been instrumental in laying out the advantages the planned automation would bring to the company:

  • It would decrease direct costs for sending out results and
  • It would decrease turn-around time allowing clients to receive laboratory results sooner

In dealing with the concerns of the supervisors, he explained a plan to deal with these personnel reductions. The plan included the following:

  • Establish a hiring freeze in the affected departments;
  • Examine back logged projects that could use the affected staff and project their re-allocation; and
  • Identify current openings within the laboratory and re-allocate the affected staff to those areas.

The CFO also advised the executive team on the use of the savings in the direct costs and re-allocation of those savings to three areas:

  • Investment in marketing and sales to drive more business by using the newly gained competitive advantage;
  • Investment in other infrastructure projects to further increase capacity; and
  • Investment in R&D to bring more tests to market.

In the next meeting with the laboratory supervisors, the CFO laid out the plan, which was enthusiastically embraced. The project gained speed and was completed very successfully.

In essence, what the CFO did was transform a stressful change into an opportunity.  He engaged the lab management and gave them an opportunity to benefit from the change.  He also made clear to the executive team the importance of using the savings to pursue business growth.

Over the years, I have been fortunate to participate in many high growth companies.  Inevitably, at various points, these companies needed automation to sustain their growth.  The guiding principles derived from the advice of that wise CFO have served me well.  I hope these same principles assist your innovative decision making.  I invite you to share any other ideas from your own experience.

 

Raul Zavaleta, CEO

Indigo BioSystems


How to get Condor running on Mac OS X

Indigo delivers high performance data analytics for diagnostic laboratory operations.  Most of the heavy lifting is done using the Condor High Throughput Computing system from the Condor Team at the University of Wisconsin.  Condor is a very powerful piece of software which simplifies parallel computing for “embarrassingly parallel” problems.  Automated instrumental data analysis can usually be pipelined into such an architecture with excellent performance results.

We use Condor both on Amazon EC2 with Ubuntu and on customer systems (mostly VMware).  There are several good starting images for Condor on Amazon EC2, and getting Condor running on most Linux distributions is simple.  We do most of our development on Macs, so I wanted a simple Condor master that could accept flocking nodes (if I needed more compute power) for developing and testing.

Here’s how I set my own system up:

Use the /etc/launchd.conf file to set the path to the Condor executables and set the required CONDOR_CONFIG environment variable.  This file might not exist on your machine, so create it if you don’t have one already.

setenv PATH /export/condor/bin:/export/condor/sbin:$PATH
setenv CONDOR_CONFIG /export/condor/etc/condor_config

When you reboot your machine, the global path will be properly set.

I run Condor as a normal user (me) with the condor_master command, so I want all the Condor binaries on my execution path.  Also, I want Condor to use my specific condor_config and condor_config.local, so I set the environment variable CONDOR_CONFIG to the main configuration file, which in turn points to condor_config.local.
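After rebooting, a quick check can confirm that both settings took effect. The helper below is a hypothetical convenience, not part of Condor: it only looks at the CONDOR_CONFIG and PATH environment variables described above.

```python
import os
import shutil


def check_condor_environment(env=None):
    """Return a list of problems found in the environment (empty means OK)."""
    env = os.environ if env is None else env
    problems = []

    config = env.get("CONDOR_CONFIG")
    if not config:
        problems.append("CONDOR_CONFIG is not set")
    elif not os.path.isfile(config):
        problems.append(f"CONDOR_CONFIG points to a missing file: {config}")

    # condor_master lives in the sbin directory added to PATH above.
    if shutil.which("condor_master", path=env.get("PATH", "")) is None:
        problems.append("condor_master was not found on PATH")

    return problems
```

Running `print(check_condor_environment())` in a new terminal should print an empty list once the launchd.conf settings are in place.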

I expanded the installation tarball (condor-7.5.6-x86_macos_10.4-stripped.tar.gz from the Condor Download Center) into a directory; I used /export/condor.  You can put this wherever you like, but this directory will be entered into the configuration script, so just make sure the directory is consistent throughout.  Later, I will show that I put all the configuration and logging directories in the same place.  You don’t have to make all these changes, but I wanted everything in a nice, neat place.

Next, I put the condor_config and condor_config.local files in the /export/condor/etc directory.

You can use the vanilla condor_config that comes with the distribution, but you have to change the following sections. The first is around line 56 (you can leave CONDOR_HOST = $(FULL_HOSTNAME) as it is):

##--------------------------------------------------------------------
##  Pathnames:
##--------------------------------------------------------------------
##  Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR       = /export/condor  

##  Where is the local condor directory for each host?
##  This is where the local config file(s), logs and
##  spool/execute directories are located
LOCAL_DIR         = /export/condor/etc  

##  Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /export/condor/etc/condor_config.local

Just put in the location for your installation here.

I also changed my execution settings to something like the Condor “TESTINGMODE”. That just means that I don’t want jobs suspended, killed or stopped if I use my computer:

# When should we only consider SUSPEND instead of PREEMPT?
WANT_SUSPEND      = False

# When should we preempt gracefully instead of hard-killing?
WANT_VACATE       = False

##  When is this machine willing to start a job?
START             = True

##  When to suspend a job?
SUSPEND           = False

##  When to resume a suspended job?
CONTINUE          = True

##  When to nicely stop a job?
##  (as opposed to killing it instantaneously)
PREEMPT           = False

##  When to instantaneously kill a preempting job
##  (e.g. if a job is in the pre-empting stage for too long
KILL              = False

PERIODIC_CHECKPOINT     = False
PREEMPTION_REQUIREMENTS = False
PREEMPTION_RANK         = 0
CLAIM_WORKLIFE          = 1200

I also found that it was easier to just create the following directories:

/export/condor/etc/spool
/export/condor/etc/run
/export/condor/etc/execute
/export/condor/etc/log
/export/condor/etc/lock

and change the following in “Part 2”

LOCK        = $(LOCAL_DIR)/lock

and “Part 4” of the configuration file:

######################################################################
##  Daemon-wide settings:
######################################################################

##  Pathnames

RUN         = $(LOCAL_DIR)/run
LOG         = $(LOCAL_DIR)/log
SPOOL       = $(LOCAL_DIR)/spool
EXECUTE     = $(LOCAL_DIR)/execute
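Creating the five directories by hand is easy enough, but a small script keeps them in step with the LOCAL_DIR setting. This is a minimal sketch; the function name is made up, and the path you pass in should match your own LOCAL_DIR (mine is /export/condor/etc).

```python
from pathlib import Path

# Sub-directories referenced by the LOCK, RUN, LOG, SPOOL and EXECUTE settings.
SUBDIRS = ("spool", "run", "execute", "log", "lock")


def create_local_dirs(local_dir):
    """Create the Condor working directories under local_dir if they are missing."""
    created = []
    for name in SUBDIRS:
        path = Path(local_dir) / name
        path.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
        created.append(path)
    return created
```

For the layout used here, the call would be `create_local_dirs("/export/condor/etc")`, run as the same user that will start condor_master.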

Also if you don’t want the system to run a benchmark (I turned this off), you can comment out the RunBenchmark line lower down in Part 4:

##  When a machine unclaimed, when should it run benchmarks?
##  LastBenchmark is initialized to 0, so this expression says as soon
##  as we're unclaimed, run the benchmarks.  Thereafter, if we're
##  unclaimed and it's been at least 4 hours since we ran the last
##  benchmarks, run them again.  The startd keeps a weighted average
##  of the benchmark results to provide more accurate values.
##  Note, if you don't want any benchmarks run at all, either comment
##  RunBenchmarks out, or set it to "False".
#BenchmarkTimer = (time() - LastBenchmark)
#RunBenchmarks : (LastBenchmark == 0 ) || ($(BenchmarkTimer) >= (4 * $(HOUR)))
#RunBenchmarks : False

Next, I don’t want to create a special “condor” user on my laptop; I just want to run Condor as “me”.  To do this, I need my user and group IDs.  Running the id command from the command prompt gives:

uid=501(“me”) gid=20(staff) groups=20(staff),…

I then use the following condor_config.local:

CONDOR_IDS = 501.20
START_MASTER = True
START_DAEMONS = True
START = TRUE
FLOCK_FROM = *
HOSTALLOW_READ = *
HOSTALLOW_WRITE = *
CONDOR_HOST = $(FULL_HOSTNAME)
DAEMON_LIST = MASTER, SCHEDD, STARTD, NEGOTIATOR, COLLECTOR
TRUST_UID_DOMAIN = TRUE

I use the condor_config.local to override anything I didn’t fix (or forgot to fix) in the condor_config file.  The CONDOR_IDS allows me to run condor_master as me and allows other nodes to flock to my machine as a master.
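The CONDOR_IDS value is just the numeric user ID and group ID from the id command joined by a period. As a small sketch (the function name is hypothetical), the line can be generated for whichever account will run condor_master:

```python
import os


def condor_ids_line(uid=None, gid=None):
    """Format the CONDOR_IDS setting; defaults to the current user and group."""
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return f"CONDOR_IDS = {uid}.{gid}"


# With the uid and gid from my id output above:
line = condor_ids_line(501, 20)
```

Here `line` holds the same `CONDOR_IDS = 501.20` entry that appears in my condor_config.local.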
If everything went right, you should be able to run the condor_master command followed by the condor_status command.  For my MacBook Pro I get:

Name           OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
slot1@rkj      OSX        X86_64 Unclaimed Idle     0.310  2048  0+00:28:31
slot2@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:32
slot3@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:33
slot4@rkj      OSX        X86_64 Unclaimed Idle     0.000  2048  0+00:28:34

Total Owner Claimed Unclaimed Matched Preempting Backfill

X86_64/OSX     4     0       0         4       0          0        0

     Total     4     0       0         4       0          0        0

Your display will vary by memory, cores, and how busy your machine is at the moment.

Your system should now be ready for condor_submit.
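To try condor_submit you need a submit description file. The sketch below writes out a very small one for the vanilla universe; the function name, the default job name and the example executable are placeholders of mine, not something taken from the setup above.

```python
def submit_description(executable, arguments="", name="job"):
    """Return the text of a minimal vanilla-universe submit description file."""
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        f"arguments  = {arguments}",
        f"output     = {name}.out",   # the job's standard output
        f"error      = {name}.err",   # the job's standard error
        f"log        = {name}.log",   # Condor's own event log for the job
        "queue",
    ]
    return "\n".join(lines) + "\n"


# A trivial test job (hypothetical example).
text = submit_description("/bin/echo", "hello", name="hello")
```

Saving `text` to a file such as hello.sub and running `condor_submit hello.sub` should queue one job, which you can then watch with condor_q.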


Bringing the web inside the lab

Do More With Less

There was a time in the pharmaceutical industry when resources seemed to be in limitless supply. If that was ever really true, those times are certainly gone now. Indigo’s products are designed to allow laboratory staff to do more with less. And not just a little more but A LOT more – and do it better. Equipment automation helps, but eliminating wasted time finding and manually manipulating data is equally important. It’s a little sad that with all the money that has been spent on data archiving, storage and electronic notebooks that more productivity enhancement hasn’t been observed. Despite millions spent by technology groups supporting laboratories, data is still too hard to manage, find, organize and analyze.

Some reasons for this are summed up by Brian Sletten in “Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design”, edited by Diomidis Spinellis and Georgios Gousios:

It is with great shame that we as an IT industry must acknowledge this embarrassing fact: it is easier for most organizations to find information on the Web than to find information in their own systems. Think about that for a moment. It is easier for them to locate data, through third parties, on a global information system than to do so within environments in which they have complete control and visibility. There are many reasons for this travesty, but the biggest problem is that we tend to use the wrong abstractions internally, overemphasizing our software and services and underemphasizing our data.

Indigo has deliberately taken the world-wide-web playbook and made it the central premise of our products. We believe that it is worth emphasizing data in the architecture of a laboratory system since this is what is generated and what will make or break a system in real world scientific work.

Sletten, who is a respected IT consultant, suggests the value of solving this problem:

…an information-focused architecture in the Enterprise demonstrates some of the same positive properties as the Web: scalability, flexibility, architectural migration strategies, information-driven access control, and so on. In the process, it empowers the business side of the house to make capital investment and software development decisions based on business needs, not simply because fragile technology choices require them to pay for flux.

We at Indigo think that empowering the ‘business side of the house’ will enable drug companies to improve their pipelines by freeing scientists to be more productive with information systems. We believe that as an information generating community, pharmaceutical scientists have one of the most important and difficult tasks: turn oceans of uncertain data into therapies which improve lives. Shifting the design emphasis to data and information instead of software, infrastructure, hardware and services, will improve the productivity of the drug industry. In other words, stop worrying so much about what languages, servers, protocols and software technologies are used and start worrying about how the data are represented, analyzed and used.

An Example

The basic idea is to make accessing internal lab data look more like accessing data on the Web. This sounds nice, but is actually tricky. It means doing what the web does: first, give everything a Web-like address and second, translate addresses into physical representations matching what is expected. As a tutorial on this approach I will walk through some examples from the BLAIS Proteomics Center at the Dana-Farber Cancer Institute for displaying proteomics analysis and analytical data.

Look at the following URI (Uniform Resource Identifier):

http://blaispathways.dfci.harvard.edu/mzServer/files/FLT3_iTRAQ/scans/55.603.html

http://blaispathways.dfci.harvard.edu/mzServer is the name of the server.  The next part, /files, tells the service to access a file, in this case one named FLT3_iTRAQ.  Then /scans says we are interested in looking at a specific scan, which we identify by its acquisition time: 55.603 tells the server to retrieve the scan at 55.603 minutes. The next part is very cool. By specifying 55.603.html, we are telling the server we want to see the scan at that time rendered as an HTML page. I could have said 55.603.jpg and it would have returned a JPEG file containing an image of the scan.

I don’t have to get the time exactly right either. I could just say 55.6.html and it would work, because the server is not looking for a file of that name (like a traditional Web server would), but passing the ‘request’ for something like 55.6 into an algorithm which will retrieve the needed scan. In this case it just grabs the closest scan to what we asked for, which is what we really meant.
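To make the idea concrete, here is a minimal sketch of how a server might take such an address apart and pick the nearest scan. It is not the mzServer implementation; the function names are made up, and the list of scan times in the usage note is invented for illustration.

```python
from bisect import bisect_left
from urllib.parse import urlparse


def parse_scan_uri(uri):
    """Split a scan address into (file name, requested time, output format)."""
    parts = urlparse(uri).path.strip("/").split("/")
    # Expected shape: <service>/files/<file name>/scans/<time>.<format>
    service, files_kw, file_name, scans_kw, leaf = parts
    if files_kw != "files" or scans_kw != "scans":
        raise ValueError(f"not a scan address: {uri}")
    # Split on the LAST dot so that "55.603.html" gives "55.603" and "html".
    time_text, _, fmt = leaf.rpartition(".")
    return file_name, float(time_text), fmt


def closest_scan_time(scan_times, requested):
    """Return the entry of the ascending, non-empty list nearest to requested."""
    i = bisect_left(scan_times, requested)
    candidates = scan_times[max(i - 1, 0):i + 1]  # neighbours on either side
    return min(candidates, key=lambda t: abs(t - requested))
```

With made-up scan times of 55.590, 55.603 and 55.617 minutes, asking for 55.6 returns 55.603, which matches the behaviour described above: the address names what you mean, and the server works out which scan that is.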

Queries are therefore constructed by users or applications by assembling URI strings rather than making complicated calls to complex Web Service or database technologies. Here’s another example:

http://blaispathways.dfci.harvard.edu/mzServer/files/FLT3_iTRAQ/ric/52.603-58.603/732.077-732.097.html

This request computes a “Reconstructed Ion Chromatogram” (ric) from the specified file displaying the time range 52.603-58.603, summing the ion signal from masses between 732.077 and 732.097 and then returns that chromatogram as an HTML page.
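The computation behind such a request is simple to sketch: keep the scans inside the time range and, for each one, sum the signal that falls inside the mass window. The code below is an illustration under assumptions, not the server's actual code; in particular the in-memory data layout, a list of (time, peaks) pairs, is invented for the example.

```python
def parse_range(text):
    """Turn an address segment such as '52.603-58.603' into (low, high)."""
    low, high = text.split("-")
    return float(low), float(high)


def reconstructed_ion_chromatogram(scans, time_range, mass_range):
    """Return [(time, summed intensity)] for scans inside time_range.

    scans is an iterable of (time, peaks), where peaks is a list of
    (mass, intensity) pairs. Both ranges are inclusive.
    """
    t_lo, t_hi = time_range
    m_lo, m_hi = mass_range
    ric = []
    for time, peaks in scans:
        if t_lo <= time <= t_hi:
            signal = sum(i for m, i in peaks if m_lo <= m <= m_hi)
            ric.append((time, signal))
    return ric


# Invented data: three scans, only the middle one falls inside the time range.
scans = [
    (52.0, [(732.08, 10.0)]),
    (53.0, [(732.08, 10.0), (732.09, 5.0), (800.0, 99.0)]),
    (59.0, [(732.08, 1.0)]),
]
ric = reconstructed_ion_chromatogram(
    scans, parse_range("52.603-58.603"), parse_range("732.077-732.097")
)
```

The two `parse_range` calls take their arguments straight from the last two segments of the address, which is the point of the design: the query is the address.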

This proteomics example reveals another important aspect of using web addresses: agreement on nomenclature and terminology.  For example, the vocabulary used in the address has to be standardized (you have to use /files, not /data or even /file). The units of the time elements and the mass elements must also be clearly defined and understood before this interface can be used.  Once a standardized naming convention is established, however, it’s much easier to use this interface than something made from SOAP, WS-*, SQL or any number of other technologies.  It’s not about the technology, it’s about the data.

Conclusion

Indigo makes data the central idea in the automation of laboratory tasks.  We believe that the focus on the content and hiding the technical details is what makes the web great and what will help make more usable, scalable and valuable systems for lab scientists.

Contact us if you would like to bring the web to your lab and find, analyze and use data from inside your company as easily as you use the web.


Why there are so many legacy applications in pharma and what to do about it

Computers have now been used to collect, store and perform calculations on laboratory data for over 30 years. Since the record retention policies for most regulated functions in a drug company state that records must be kept for at least 35 years, some of what was collected in the early days of computerized laboratory operation must still be kept.

During the 1990s the Food and Drug Administration recognized that computers, being used to collect so-called “electronic records”, were fundamentally different from measurement systems which printed to paper.  A device which printed only to paper, and kept no internal record, could be used to meet FDA evidence requirements if the paper were signed, dated and secured against modification.  Computer systems were expected to meet the same requirements, despite a lack of tools to ensure electronic files were as valid as the earlier paper outputs had been.

IT departments, instrument vendors and laboratory software vendors struggled with the problem of computer system validation and the security of electronic records through the 1990s and into the 2000s.  The primary response was to create a heavyweight, system-wide validation to ensure that regulated computerized systems were tested and compliant with the new agency guidance known as 21 CFR Part 11.  Once validated, data could be stored or managed electronically as long as the validation was maintained through system upgrades and modifications and as long as the application was operational.  Data could be migrated from one system to another only through extensive validation of both the new application and the migration tools.  The expense of migration and validation gave an incentive to keep regulated systems operational over extended periods of time.  It also created a situation where the number of systems used by an organization increased, as it became more cost effective to transition from one system to another by leaving old data “retired” in-place.

As the use of computerized laboratory systems increased through the expansion of applications such as Laboratory Information Management Systems (LIMS) and Electronic Laboratory Notebooks (eLN), the demand on IT support groups also increased. New, more complex systems based on relational databases such as Oracle were added on top of existing ‘legacy’ systems.  The support and maintenance costs to IT departments across the industry have become staggeringly high.  Ironically, while customer expectations were being raised by the expansion of retail, mass-market computer technology, IT departments saddled with decades of system acquisitions spent most of their resources maintaining older, less valuable systems. Every drug company can make a graph like this:

[Figure: Application cost benefit plot]

A tipping point has come with the latest wave of industry consolidation and healthcare reform legislation.  Newly merged companies found themselves with expensive, duplicate functionality provided by incompatible, non-integrated legacy systems during a time when margins and market cap plunged.  Solving the integration problem and reducing the cost of maintaining legacy systems, so that more resources can be devoted to driving innovation in drug discovery and development, has shifted from a nice-to-have to a must-have.  Two questions arise:

  • How can regulated data, which must be accessed using some part of a laboratory application, be migrated from a legacy system without losing the minimum functionality needed to meet regulatory requirements?
  • How can data stored in multiple, incompatible data models in a variety of database systems be integrated so that the assets of merged companies can be properly combined to help improve product pipelines?

Indigo BioSystems has attacked this problem and produced an elegant solution to both questions.

As an integrated laboratory data management and analysis system, INDIGO was designed to store the data from any laboratory operation – with the goal of organizing and integrating information in order to support automated analysis.  INDIGO uses a very abstract database system, which flexibly stores relationships between data objects, as opposed to storing data in fixed structures as is done in most legacy applications and relational database systems.  INDIGO also allows open format raw data to be stored, since direct access to raw data is critical to most data analysis tasks.  By using open format data and storing data extracted from relational databases in a ‘semantic’ model where data is stored and annotated for meaning, INDIGO ensures that data can be searched and found indefinitely.  Finally, to perform data analysis, INDIGO provides full analysis tools, including complex computational capabilities used to visualize and report on data and results.

Indigo BioSystems combined all of these capabilities into a single system following design principles pioneered by large-scale data management operations at Google, Amazon and Yahoo.  By insisting that INDIGO be capable of large-scale scientific analysis on the order of these internet giants,  our engineers avoided the design errors which now plague almost all enterprise software: poor performance due to inherent lack of scalability.  At a time when the drug industry badly needs better productivity, the most frequently cited culprit is slow performing software systems, many purchased to speed the very processes they are slowing with poor scalability.

By extracting data from legacy systems, storing it in INDIGO’s massively scalable storage system and replacing the critical search, calculation and reporting functions of those systems, it is now possible to truly retire legacy systems and simultaneously integrate their data to produce information aggregation.  Some or all of a legacy system’s functionality can be implemented using the scripts and workflow engine supported by INDIGO.  Data can be extracted into “resources” without performing extensive reverse engineering of legacy data models.  A “Resource Oriented Architecture” allows links to be formed between data items as needed to create a dynamic data model which grows more searchable and valuable over time.
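
To make the idea of “resources” and links concrete, here is a minimal sketch in Python.  It is not INDIGO’s implementation or API; the class names, URIs, field names and relation names are all hypothetical, chosen only to show how items pulled from different legacy systems can be linked after the fact without designing a shared schema first.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """One item extracted from a legacy system, kept as free-form fields."""
    uri: str
    fields: dict = field(default_factory=dict)


class ResourceStore:
    """Hypothetical in-memory store: resources plus named links between them."""

    def __init__(self):
        self.resources = {}   # uri -> Resource
        self.links = {}       # (source uri, relation) -> set of target uris

    def put(self, uri, **fields):
        # No fixed schema: each resource carries whatever fields it came with.
        self.resources[uri] = Resource(uri, dict(fields))

    def link(self, source, relation, target):
        # Links are added as needed; existing resources are not modified.
        self.links.setdefault((source, relation), set()).add(target)

    def related(self, source, relation):
        # Return linked URIs in sorted order so results are stable.
        return sorted(self.links.get((source, relation), set()))


# Illustrative data only: two items from two made-up legacy systems.
store = ResourceStore()
store.put("lims/sample/42", matrix="plasma")
store.put("eln/experiment/7", title="PK study")
store.link("eln/experiment/7", "used_sample", "lims/sample/42")
print(store.related("eln/experiment/7", "used_sample"))
```

Because links live outside the resources themselves, new relationships can be added later without migrating or restructuring the data already stored, which is the sense in which the model is “dynamic.”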

Let the Data Flood Begin: Full Scan PK at MSACL

Development of a Small Volume Sampling Technique and LC-MS Orbitrap Assay for Pediatric Pharmacokinetic Studies of Fentanyl and its Metabolites (Uwe Christians, Clinical Research and Development, Department of Anesthesiology, University of Colorado Denver).  In this talk, Uwe showed the sensitivity and selectivity of using full scan data for PK-type data.  Because you don’t have to select a specific transition, you get a full spectrum during each acquisition, which can be interrogated later for metabolites that may not have been part of an initial hypothesis.  Kevin Bateman from Merck showed this type of experiment a few years ago at ASMS, but it appears that the Orbitrap can really do this experiment.  This blows up the amount of data collected in a PK study by at least an order of magnitude, and it increases the value of data stored for long term access by at least as much.
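
One common way to interrogate stored full scan data after the fact is to pull out an extracted ion chromatogram for an m/z value nobody targeted at acquisition time.  The sketch below is a minimal illustration, assuming each scan is stored as a retention time plus a list of (m/z, intensity) peaks.  The function name, data layout, tolerance and all numeric values are made up for illustration and do not come from the talk.

```python
def extract_ion_chromatogram(scans, target_mz, tol=0.01):
    """Sum the intensity of peaks within tol of target_mz, scan by scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) tuples.
    """
    trace = []
    for retention_time, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        trace.append((retention_time, intensity))
    return trace


# Made-up full scan data: three scans, each holding every peak observed.
scans = [
    (0.1, [(300.200, 10.0)]),
    (0.2, [(300.201, 40.0), (250.152, 50.0)]),
    (0.3, [(300.199, 20.0), (250.149, 80.0)]),
]

# Query an m/z that was not part of the original hypothesis.
trace = extract_ion_chromatogram(scans, target_mz=250.15)
print(trace)
```

With a targeted method, only the preselected transition would have been recorded and the second query would be impossible.  Here the full spectra are kept, so any later question costs one more pass over the stored scans.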

Leroy Hood at MSACL: Data reduction will be the key to personalized medicine.

Dr. Hood suggests that signal-to-noise in biological measurements is so bad that we must use 1) statistics, 2) a deep understanding of the pathways and 3) data integration to make any progress. All of his data is shown as networks – perfectly aligned with the large-scale linked data approach Indigo uses. Hood said: “Medicine is becoming an information science.” If that’s true, new approaches to informatics and IT will be essential.

Indigo’s Cloud Provider Highlighted as High Quality Security Example

Indigo BioSystems is now using BlueLock LLC to provide hosting/infrastructure for the Indigo Platform and our Software-as-a-Service offerings.  Indigo selected BlueLock for security-intensive applications in the pharmaceutical industry and apparently we are not alone.

From ComputerWorld, “Cloud security: Try these techniques now”:

BlueLock’s virtualized environment allowed data and volumes to move between systems in a dynamic, low-cost way that would be impossible with a traditional, hosted environment, Westgate says.

There were, however, security concerns to be addressed before Logiq³ would entrust its critical systems to BlueLock’s cloud. The life reinsurance company handles death records, which include personal information like social security numbers, as well as financial data and information about major assets that its large financial customers have on their books. Although Logiq³ isn’t regulated by the U.S. government’s Sarbanes-Oxley Act, its customers in the financial sector are, “so they’ll be auditing us,” says Westgate. As a result, Logiq³ needed potential cloud vendors to demonstrate that they were in compliance with applicable regulations and could provide high levels of security.

The thing we like about BlueLock is the data protection architecture and the ability to perform audits while still achieving the elasticity and location transparency needed for SaaS.  We too are audited by our customers to ensure our applications protect data and prevent tampering.  The idea of separating roles is key to security in externally hosted systems.  Our approach, discussed at the ALA Conference, takes the separation one step further by encrypting the data so that neither the Indigo admins nor the BlueLock admins have the keys needed to access customer data.

Encryption adds to the security enabled by the “division of labor” described in the article:

The division of labor between Logiq³ and BlueLock actually strengthened security, because “no one person, or company, has all the keys to the kingdom,” says Westgate. Because BlueLock manages the firewall, for example, “none of my admins can go in and decide to sell or move the data,” he notes. “And BlueLock admins can’t do it either, because they don’t control the systems.”

Audits and accreditation are also needed because as good as this all sounds it won’t work if the SOPs are not being followed, or if there are holes in the procedures.

Therefore, due diligence is critical, Anderson says. Pfizer uses SAS 70 Type 2 certification, in which an independent third party audits the service provider’s internal and data security controls. Anderson also verifies the vendor’s level of Safe Harbor compliance and checks Dun & Bradstreet research to make sure it’s legitimate, he adds.

Another standard by which to evaluate a service provider is ISO 27001, which defines best practices for designing and implementing secure and compliant IT systems.

While such standards provide a useful starting point, their criteria tend to be generic, says Gartner’s Heiser. Companies still need to match a service provider’s specific controls to their specific requirements, he adds.

For example, after checking out BlueLock’s SAS 70 Type 2 accreditation, Logiq³’s IT staff did a further evaluation to “make sure the controls we require are supported by the controls they have in place,” Westgate says. His team then followed up on discrepancies, identifying missing controls and working with the vendor on solutions. The company plans to repeat the process at least once a year, he says.

It is clear that shared services and externally hosted data are a part of pharma’s future.  Indigo is working hard to make sure that its customers gain the benefits of this new approach while minimizing the risks.

To read more of what we are up to, check out our website and blog.

Pfizer and Indigo Discuss Shared Services on ALA 2010 Informatics Panel

ALA 2010 Panel Julian.pdf (250 KB)

I served on a panel in the informatics track at the Lab Automation Conference last week with people from Pfizer.  We were each allowed a couple of slides to introduce a key point.  My slides are attached to this post.  The interesting thing to me was how aligned Indigo and the Pfizer scientists were on the use of shared services to improve productivity in research.  The idea expressed on the panel was that we can make our relationships with collaborators much richer by making data “location transparent” and the computational resources needed to process them “elastic”.  These are the two main promises of so-called cloud computing infrastructures.  The key is to encrypt everything in the shared service using security standards developed by other industries, which ensures data protection while still gaining the ‘elasticity’ and ‘location transparency’ by allowing selective access to data for those who need it.

The key idea expressed by the audience was that data security is the top concern of research organizations considering or using shared infrastructure.  I was delighted that there was strong agreement between Indigo and Pfizer on how to solve this problem and that the benefits would be an increase in productivity for everyone.
