Being able to manage unstructured text is no longer a “nice to have,” as companies and individual users alike have to deal with increasing amounts of unstructured text, work with it, and gain knowledge by interpreting it. Cirilab is a company that provides software products to search, retrieve, and categorize information from unstructured text sources. Read this interesting interview with Ron Carrière, chief executive officer (CEO) of Cirilab, and get his take on the role of Cirilab in the software industry and the company’s approach to the unstructured content management space.
I welcome your thoughts—leave a comment below, and I’ll respond as soon as I can.
Ron Carrière, CEO, Cirilab
Mr. Carrière’s career has included responsibilities as a professional engineer in senior positions at technical and management levels in public and private organizations. He has extensive hands-on experience in high-tech start-up companies. He was founding CEO of ACDS Graphic Systems Inc. from 1983 to 1988, where he led a multimillion-dollar initial public offering (IPO) and stewarded ACDS software to revenue in excess of $20 million (USD) before his departure. In 1990, he founded Nucor Hyper Technologies Inc. and in 1995 Le Centre International de Recherche en Infographie, and was chairman of both organizations. In 2001, he co-founded what would become Cirilab Inc., where he currently serves as president and CEO.
Hello Mr. Carrière. Could you tell us about your experience working with unstructured data?
RC. Working as a geodetic engineer from 1973 to 1978, I participated in converting an inertial navigation platform (off a cruise missile) to an inertial survey platform, a huge real-time data crunching machine.
From 1978 to 1983, I worked on inertial survey platform (ISP) and global positioning system (GPS) technology for the first private firm to use GPS, and retrieved massive amounts of data to determine an object’s exact position.
From 1983 to 1989, I served as CEO of a geographic information system (GIS) and computer-aided design (CAD) software company, where there were many complex indexing systems involved, but with a group of 70 programmers, I solved major data management issues for utilities and government clients.
From 1990 to 1997, I worked on satellite imagery processing and sonar pattern recognition of massive unstructured data containing intelligence. I was able to find that intelligence and develop a Semantic Intelligence (SI) Engine.
I then gathered several data scientists to develop our Knowledge Generation Engine to be applied to text—the most underrated unstructured piece of data in all our databases. I partnered with Mark Hurst, chief technology officer (CTO) of Cirilab, who has a background in science and artificial intelligence. Mark is currently focused on text analytics; associative, contextual, and analogical modeling; collaborative analysis; and organization and discovery.
What do you think triggers the need for the management and analysis of unstructured data?
On the enterprise side, the triggers are governance (risk management) and the belief that analysis will help, though there is very little return on investment (ROI) and the analysis is associated with huge expenses.
For a consumer, therein lies the “knowledge nugget.” The question is, “How do I find it?” After searching, the consumer expects and deserves to find some information that is of relevance.
What does Cirilab offer for people that need to handle unstructured data?
We have a series of products (semantic tools) and services that provide a means for a “self-describing source,” a Knowledge signature (Ksig) of a document, or a Knowledge map (Kmap) of several documents. These tools allow you to get to a level where you can garner “meta-knowledge,” and be able to carry out knowledge management. You can then process that meta-knowledge to achieve “meta-cognition” and build an intelligence infrastructure. (Don’t worry, I come down to earth in the next few questions.)
What is the difference in focus between your consumer and enterprise products?
For a consumer, the focus is to save time, assist him or her in managing a knowledge base, and fill in the knowledge gaps. For an enterprise, it comes down to benefiting from those individuals using the system. There is a focus on the value of a “corporate memory” garnered from analyzing a knowledge base. The goal is to make use of collaborative efforts (Kmap of blog posts, chats, e-mail, documents, etc.) to facilitate access to knowledge within minutes, and then to save and share that knowledge.
What are the differences between Cirilab and other similar applications?
The main difference is the focus on SI tools for the individual user and his or her ROI, a subject very dear to most knowledge workers (other than information management [IM] and information technology [IT]), and on the unstructured data, the place where each document is allowed to express itself. The document is the algorithm!
What is the most common problem your customers are trying to solve when they approach you?
They need to organize and analyze large amounts of data in a short time. That may be 10 documents (100 pages each) for tomorrow morning, 108,000 chats for next week, or a profile of 1,500,000 documents for a round of litigation. With the software-as-a-service (SaaS) model, customers first register, but then pay only when they use the system.
Do you see any specific trend in the space of unstructured data management?
In the early 1990s, we bid on, won, and delivered a project on large engineering and unstructured data management. I led a SWAT team of data scientists that delivered the project at 20% of the budget and in 25% of the allotted time. The solution and the savings came from a multidimensional index (GIS type) and from processing only 10% of the data and modeling all knowledge nuggets based on an aerial photo. The results were more than adequate and came at great savings.
The present trends in business intelligence (BI) and SI are similar and very promising with so much data being produced today.
What is your take on so-called big data? How do you think an organization should approach the issue of managing huge amounts of data, especially when it’s unstructured?
It should be one user, one folder at a time, with the prime design parameters being “relevance” and “timeliness.” The key to a semantic index is to identify what may be useful in a future discovery. You will satisfy 80% of the current needs at 20% of the costs—and still have the ability to drill down to satisfy more details should you choose to process more.
There is always some discussion about how to measure the ROI of an application, especially in the area of data management. What is your take? What is the real value of an application like yours?
Saving time in reaching a decision is valuable to both the individual and the enterprise. A simple speed-read function summarizes 40 pages in seconds and gives “corporate memory,” a knowledge tag of things that are happening.
What is your favorite movie?
Star Trek—I think DATA had our software in his head.
Do you prefer dogs or cats?
We had a barn full of both on a farm I used to have. I do love my grandkids!!!