Do you think Big Data and NoSQL are the latest and coolest trends in the data world? No way. Software architects and geeks lose sleep looking for new and unknown trends and opportunities. Last week I attended COFES 2013 in sunny Arizona. One buzzword caught my attention during one of the presentations: Data Exhaust.
I tried to find a good definition of what this term means. There is no consolidated view on it. The best explanation of Data Exhaust I found is on the IT Law Wiki, which provides four different definitions. The following one resonated the most with my way of thinking about data exhaust:
The "aggregation of [consumer] data through the digitization of processes and activities" in the commercial sector which generates metadata supporting corporate profit generation.
Here is a picture I captured during the COFES 2013 presentation. It shows the idea of data being exposed as a result of mobile device usage.
Data exhaust is tightly connected to some notions of big data. Another interesting article I came across was published on the O’Reilly Strata website, titled "Tertiary data: Big data’s hidden layer". The article is worth reading. We are producing lots of data these days, and this data can be very valuable. Unfortunately, we are far behind in our ability to capture the data we produce and extract value from it. Here is an interesting passage:
Back in the days of floppy disks, the lines of ownership were pretty clear. If you had the disk, the data was yours. If someone else had it, it was theirs. Things these days are much blurrier. That tertiary data — data that’s generated about us but not by us — doesn’t just build up on your mobile devices of course. Other people are building datasets about our patterns of movement, buying decisions, credit worthiness and other things. The ability to compile these sorts of datasets left the realm of major governments with the invention of the computer. We’re all aware of this, and there’s even a provocative buzzword to describe it: data exhaust. It’s the data we leave behind us, rather than carry with us.
I captured the following picture from the same article. It shows a visualization from the iPhone location tracker.
The data exhaust conversation made me think about Product Lifecycle Exhaust. In everything PLM does today, we are very focused on how we create data during the engineering and manufacturing stages. PLM products pay little to no attention to the information that products generate during their lifecycle. The situation is better for long-lifecycle products such as airplanes and nuclear submarines. But this is where PLM's attention to lifecycle information ends.
What is my conclusion? Cost and quality are the two top priorities of every manufacturer. In my view, data exhaust can be an interesting source of information about how to improve quality and reduce cost. We can learn how our products are actually used, discover which features customers do not use, and learn how to optimize products to serve our customers better. Just my thoughts. Do you see it the same way? Speak up. I want to know your opinion. If it resonates, please share examples.
Best, Oleg

Posted by olegshilovitsky
Data is a trending topic these days. Big Data is even more fascinating. It made me think about the meaning of power. In the past, oil was synonymous with power. These days, that applies to data. Social data, corporate data, any data. The ability to dig into data, discover facts and relationships, and make decisions captures the minds of companies, technologists and investors. Everything I mentioned above applies to manufacturing companies and the data these companies hold on their servers, data centers and desktop computers.
I want to talk about hardware today. You are probably surprised, but I hope not too much. During the last 10-15 years, most of the work PDM/PLM systems were doing was focused on commodity low-end x86 servers. There is nothing wrong with that. Nevertheless, I can see some new trends coming in this space. They come with web development, large data scale, mobile, data analytics and more. I can clearly see two patterns in how vendors are using hardware. One is an attempt to build proprietary data centers from commodity-level servers (e.g., Google). The other is to focus on delivering solutions bundled with specific high-profile hardware platforms (IBM PureData, Oracle Exadata, Cisco, etc.). Data centers are an ideal place for this type of box.
The issues of data, data lock-in and interoperability usually drive lots of debates and discussions. Starting early with the support and conversion of CAD data formats, interoperability has remained a complicated topic for PDM and PLM systems. Companies are still investing lots of money and effort in converting and translating data. The introduction of SaaS and cloud platforms brought new waves of discussion: what happens to our data in the cloud? What if cloud software vendors lock in my data, and I am not able to get it out? What if a cloud vendor goes out of business, and the data disappears? These are all very important questions.
I think the agreement about the importance of the data model among all implementers of PDM / PLM is almost absolute. Data drives everything a PDM / PLM system does. Therefore, defining the data model is the first step in many implementations. It sounds simple. However, there is implied complexity. In most cases, you will be limited by the data model capabilities of the PLM system you have. At this point, I want to take you back in history.
I’m in a deep technological mood these days. As you probably noticed, I’m attending 


Manufacturing companies are aggregating a lot of data these days. Data is coming from many places. For many years, product development, manufacturing and supply chain were the major sources of data in companies. Nowadays, data is also coming from outside the company. The internet, social networks and communication have created new sources of information. The intersection of data from inside a company and outside data is a very interesting place. I’ve been reading the Forrester blog – 





