What PLM Architects and Developers Need to Know about NoSQL?

July 7, 2014


People keep asking me questions about NoSQL. The buzzword "NoSQL" isn’t new. However, I find it is still confusing, especially for developers who mostly focus on enterprise and business applications. Over the last decade, database technology went from a single decision to a much higher level of diversity. Back in the 1990s, the decision of PDM/PLM developers was more or less the following: "If something looks like a document, use Excel and Office. Otherwise, use an RDBMS". Not anymore. My quick summary of NoSQL is here: What PLM vendors need to know about NoSQL databases. You can go deeper in my presentation: PLM and Data Management in 21st century. If you feel more "geeky" and are considering summer development projects, I can recommend the following book: Seven Databases in Seven Weeks.

John De Goes’ blog post The Rise (and Fall?) of NoSQL made me think about how to explain the need for NoSQL to PLM implementers, architects and developers. In a nutshell, here is the way I’d explain it: NoSQL databases allow you to save a variety of specific data in a much simpler way compared to SQL structured information. So, use the right tool for the right job: key/value, document, graph, etc.

So, NoSQL is accelerating the development of cloud and mobile apps. Development became much faster because some specific NoSQL databases are tuned for particular types of non-structured data:

With NoSQL: (1) Developers can stuff any kind of data into their database, not just flat, uniform, tabular data. When building apps, most developers actually use objects, which have nesting and allow non-uniform structure, and which can be stored natively in NoSQL databases. NoSQL databases fit the data model that developers already use to build applications. (2) Developers don’t have to spend months building a rigid data model that has to be carefully thought through, revised at massive cost, and deployed and maintained by a separate database team within ops.

However, everything comes with a price. The important insight of the article is to point out how data can be reused for reporting and other purposes. The following passage summarizes the most visible part of what is missing in NoSQL:

It’s quite simple: analytics tooling for NoSQL databases is almost non-existent. Apps stuff a lot of data into these databases, but legacy analytics tooling based on relational technology can’t make any sense of it (because it’s not uniform, tabular data). So what usually happens is that companies extract, transform, normalize, and flatten their NoSQL data into an RDBMS, where they can slice and dice data and build reports.
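To make the "extract, transform, normalize, and flatten" step more concrete, here is a minimal sketch in Python. The part document, its field names and the `flatten_part` helper are my own hypothetical illustration and are not taken from the quoted article or from any specific product.

```python
# Hypothetical part document, the way an app might store it in a document database.
part = {
    "number": "PN-1001",
    "name": "Bracket",
    "revisions": [
        {"rev": "A", "files": ["bracket_a.sldprt"]},
        {"rev": "B", "files": ["bracket_b.sldprt", "bracket_b.pdf"]},
    ],
}


def flatten_part(doc):
    """Turn one nested part document into flat (number, name, rev, file) rows."""
    rows = []
    for revision in doc["revisions"]:
        for file_name in revision["files"]:
            rows.append((doc["number"], doc["name"], revision["rev"], file_name))
    return rows


rows = flatten_part(part)
for row in rows:
    print(row)
```

The nested object is what the application works with. The flat rows are what a relational reporting tool expects, which is why the flattening step shows up as extra work.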

PDM and PLM products are evolving these days from the early stage of handling "records of metadata" about files towards something much more complicated: large amounts of data, unstructured information, video, media, processes, mobile platforms, analytics. CAD/PLM vendors are pushing towards even more complicated cloud deployments. The last one is even more interesting. The need to rely on the customer’s RDBMS and IT alignment is getting less restrictive. So, the opportunity to choose the right database technology (aka the right tool for the job) is getting more interesting.

What is my conclusion? The universe of database technologies is much more complicated compared to what we had 10-15 years ago. You need to dig into your data management needs and choose the right technology or tool to be efficient. One size doesn’t fit all. If you want to develop an efficient application, you will find yourself using multiple data management technologies to handle data efficiently. Just my thoughts…

Best, Oleg

Tech Soft 3D TechTalk: PLM and Data Management in 21st Century

October 25, 2013


Boston is one of the rare places where you can meet many CAD and PLM people at the same time in the same place. You don’t need to guess much why. The MIT CAD Lab as well as many companies in this domain made Greater Boston a unique place for talent in the CAD and PLM space.

Tech Soft 3D is a well-known technology outfit helping many companies in the CAD and PLM domain to develop successful products. Besides that, Tech Soft 3D sponsors Tech Talk, a gathering where technologists in the CAD/PLM domain come, network and share their experience. Yesterday was my first time attending Tech Talk in downtown Boston. I missed the one last year because of a crazy travel schedule. This year I was honored to be invited and to give a short talk. I shared my experience and thoughts about database and data management technology trends. As part of my presentation, I shared my thoughts about the so-called NoSQL trend, what it contains and how it can be useful for CAD and PDM/PLM. Below you can see the full slide deck of my presentation.

PLM and Data Management in 21st Century from Oleg Shilovitsky

On the following slide, you can see a simplified decision table that can help you determine which noSQL databases can be useful for different types of solutions.


What is my conclusion? Database and data management technology is going through a Cambrian explosion of different options and flavors. It is the result of a massive amount of development coming from open source, the web and other places. The database is moving from “solution” to “toolbox” status. A single database (mostly RDBMS) is no longer a straightforward decision for all your development tasks. My hunch is that CAD/PLM developers need to ramp up with tools and knowledge to tackle future database decisions. Just my thoughts…
Best, Oleg

Why New Database Technology Won’t Solve PLM Problems?

March 18, 2013

Disruptive technologies and solutions. This is a topic beloved by bloggers, analysts and vendors. We like to talk about how disruption will happen. For a very long period of time, database technology was a stable element in the overall PDM/PLM technological architecture. The decision about the usage of a relational database was only about how to choose between Microsoft SQL Server, Oracle DB and IBM. However, recently, I started to hear voices about changes that might happen in the world of databases. I’m sure you’ve heard about noSQL. If not, you can catch up via this link: What PLM vendors need to know about noSQL databases? The question of whether noSQL or another alternative to SQL can disrupt PLM development and products is an interesting one. Yoann Maingon of Minerva is asking this question in his blog post: New database system for future of MDM. Yoann names the database as one of the inhibiting factors preventing users from moving from one PLM solution to another. Here is an interesting passage:

I hope the actual economy environnement will convince more people to invest time for helping the product development initiatives by solving Product Lifecycle Management related pending issues. Today’s article is about my thought on why it is always so complicated to move from a PLM solution to another, why are we changing tons of things and settings to in fact change a software cost or an interface. On of the reason is that all these elements are strictly related and the elements in a PLM solution can’t be dissociate. You can’t just keep you data and change the interface. From my previous experiences of migration, I’ve realized that I was moving data from a system to another with transformations but the data would still mean the same thing it is just that the software would handle it differently. Then speaking of standards would there be a standard for data management in PLM and should this define a technology or at least should it drive database system editors to push for one?

I’m in a bit of disagreement with Yoann. In my view, SQL, noSQL, XML and many other database flavors all provide an infrastructure that can be used in different ways. Yes, there are differences between noSQL and SQL databases. However, if we speak about building applications, the way the application is created will be a much more significant factor in the efficiency and behavior of your application. Earlier this year, I saw an interesting article that focuses on this specific question: NoSQL or Traditional Databases: Not Much Difference. Take your time and read it. You will find it very educational. I absolutely liked the following passage:

I have spent considerable time tuning SQL statements and indexes, but in the end the best optimizations have always been those on the application and how the application uses the database. SQL Tuning almost always adds complexity and often is a workaround over bad application or data structure design. In the NoSQL world “SQL statement” tuning for the most part is a task of the past, but Data Structure Design has retained its importance! At the same time logic that traditionally resided in the database is now in the application layer, making application design even more important than before. So while some things have shifted, from an Application Performance Engineering Perspective I have to say: nothing really changed, it’s still about the application. Now more than ever!

What is my conclusion? No surprises. It is all about your application: logic, tuning, performance, mistakes and the ability to create a reliable and efficient solution. Yes, different database techniques require different technical skills. But if you have them, your ability to build a PLM system on any of these platforms will be almost identical. Just my thoughts…

Best, Oleg

What PLM vendors need to know about noSQL databases?

December 14, 2012

Relational databases are a very mature set of technologies. We use RDBMS (relational database management systems) practically everywhere these days. It is hard to imagine enterprise software and PDM/PLM systems without relational databases. At the same time, a new class of database management solutions is coming. It is called NoSQL (Not Only SQL). I have posted about noSQL a few times. You can refresh your memory by navigating to the following link. The term first came into use back in 1998 as the name of a lightweight relational database that did not expose a SQL interface; "NoREL" was later suggested as a more accurate label for non-relational stores. Later, in 2009, the term noSQL was proposed "to label the emergence of a growing number of non-relational, distributed data stores that often did not attempt to provide atomicity, consistency, isolation and durability guarantees that are key attributes of classic relational database systems". NoSQL database solutions are widely used today in web and mobile applications. I can see growing usage of noSQL databases in business intelligence and master data management applications.

NoSQL is not a single database. It is a name for a broad set of data management or database technologies outside of the RDBMS world. The technologies and terminology behind this term are new. PDM/PLM vendors ignored noSQL database management solutions until very recently. It made me think about providing a quick summary of what stands behind this broad term and what PDM/PLM use cases it can support.

Key-value (KV) databases

The KV store is the simplest database model in the noSQL world. It stores "keys" and associated "values". Basically, your database is a store of key-value pairs. Some databases support more complex structures as values (list, hash), but this is not required. One interesting PDM/PLM use case is to store a list of files in a key-value database. In such a case, the file name (including the full path) is the key and the value is the actual content of the file. Examples of KV stores are Riak and Redis.
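The file use case can be sketched in a few lines of Python. This is only a toy in-memory illustration of the key-value idea, not the API of Riak or Redis; the `FileStore` class and the vault paths are hypothetical names I made up for the example.

```python
class FileStore:
    """A toy in-memory key-value store: full file path -> file content (bytes)."""

    def __init__(self):
        self._data = {}

    def put(self, path, content):
        # Storing under an existing key simply overwrites the previous value.
        self._data[path] = content

    def get(self, path, default=None):
        # Lookup is by exact key only; there is no query language.
        return self._data.get(path, default)

    def delete(self, path):
        self._data.pop(path, None)


store = FileStore()
store.put("/vault/projects/p1/bracket.sldprt", b"...binary CAD data...")
store.put("/vault/projects/p1/bracket.pdf", b"%PDF-1.4 ...")

content = store.get("/vault/projects/p1/bracket.pdf")
missing = store.get("/vault/projects/p1/unknown.step")
```

The whole interface is put, get and delete by key. That simplicity is the point, and also the limitation: anything like "find all files of revision B" has to be solved outside of the store.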

Column-oriented databases

This type of database is very close to RDBMS. The main difference is that the columnar data model is designed to keep the data from every column in the table together. It is the opposite of the RDBMS approach, which keeps the data for a specific row together. It allows you to add a column to a table in a very "inexpensive" way. Each row may have a different set of columns. These databases are good for reporting and business intelligence solutions. The columnar data model has influenced a few of the PDM/PLM core modelers available on the market today by providing a higher level of flexibility in data modeling. An example of a column-oriented database is HBase.
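The difference between row and column layout can be shown with a small Python sketch. It is a simplified illustration of the idea only and does not reflect how HBase actually stores data; the part records and the `to_columns` helper are hypothetical.

```python
# Row-oriented layout: all values of one record are kept together.
rows = [
    {"number": "PN-1001", "material": "steel", "mass_kg": 0.42},
    {"number": "PN-1002", "material": "aluminum", "mass_kg": 0.15},
    {"number": "PN-1003", "material": "steel", "mass_kg": 1.10},
]


def to_columns(records):
    """Regroup records so that all values of one column are kept together."""
    columns = {}
    for index, record in enumerate(records):
        for name, value in record.items():
            columns.setdefault(name, {})[index] = value
    return columns


columns = to_columns(rows)

# A reporting query touches a single column and ignores everything else.
total_mass = sum(columns["mass_kg"].values())

# Adding a column is cheap: one new entry, holding values only for rows that have one.
columns["supplier"] = {1: "Acme"}
```

Summing the mass reads one column only, and the new "supplier" column exists for a single row without touching the others.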

Document-oriented database

Document databases manage data in the form of documents. Documents can differ from each other and have different structures. This makes document-oriented databases very flexible. Some implementations, such as MongoDB, let you run queries against the document structure and also do MapReduce computations. Depending on your needs, you can consider different document-oriented databases. Examples are MongoDB and CouchDB. You can consider a document database for PDM/PLM in two cases: the need for a high-performance, scalable document store, and free-form data modeling.
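Here is a minimal Python sketch of free-form documents and a query against their structure. It uses plain dictionaries and is not the MongoDB or CouchDB API; the documents, field names and the `find` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Documents in one collection do not have to share the same structure.
documents = [
    {"_id": 1, "type": "part", "number": "PN-1001", "material": "steel"},
    {"_id": 2, "type": "part", "number": "PN-1002", "material": "aluminum",
     "supplier": {"name": "Acme", "country": "US"}},
    {"_id": 3, "type": "change_request", "title": "Increase wall thickness",
     "affected": ["PN-1001"]},
]


def find(docs, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


steel_parts = find(documents, type="part", material="steel")

# A very small aggregation in the spirit of map and reduce: count documents per type.
counts = Counter(d["type"] for d in documents)
```

Documents that lack a queried field are simply skipped, so a new field can be introduced on some documents without changing the others.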

Graph-databases and triple stores

The graph data model deals with highly interconnected data. It contains nodes and relationships between nodes. Both nodes and relationships can have properties (key-value pairs). This data model becomes really important when you traverse nodes through specific relationships. There are many situations in PDM/PLM applications when we need to traverse data efficiently. Graph databases (and their predecessors, object databases) have great potential to bring value here. An example of a graph database is Neo4j. A specific case of graph databases is the so-called triple store, which manages information using triples (subject-predicate-object). Examples of triple stores are OWLIM and AllegroGraph. Triple stores are also supported by Oracle and IBM DB2.
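A product structure is a typical example of such a traversal. The following Python sketch keeps the data as subject-predicate-object triples and walks one relationship type breadth-first. It is an illustration of the idea only, not the API of Neo4j or of any triple store; the bike structure and the `traverse` helper are hypothetical.

```python
from collections import deque

# Hypothetical product structure stored as (subject, predicate, object) triples.
triples = [
    ("Bike", "has_part", "Frame"),
    ("Bike", "has_part", "Wheel"),
    ("Wheel", "has_part", "Rim"),
    ("Wheel", "has_part", "Spoke"),
    ("Frame", "described_by", "frame.sldprt"),
]


def traverse(start, relationship, triples):
    """Breadth-first walk from start, following only the given relationship."""
    adjacency = {}
    for subject, predicate, obj in triples:
        if predicate == relationship:
            adjacency.setdefault(subject, []).append(obj)

    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, []):
            if child not in seen:
                seen.add(child)
                visited.append(child)
                queue.append(child)
    return visited


parts = traverse("Bike", "has_part", triples)
print(parts)
```

The same data answers a different question by following a different relationship, for example "described_by" to find the files behind a part, without any change to the data model.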

CAP Theorem and why PLM systems need to use more than one database?

In computer science, the CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency (all nodes see the same data at the same time), Availability (a guarantee that every request receives a response about whether it was successful or failed) and Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system). Navigate here to read more. It is a question of priorities and a tradeoff between the requirements you need to satisfy in your system. PLM systems face significant challenges in the variety of data types, retrieval patterns and data scaling. Using different database management strategies can improve existing solutions.

What is my conclusion? PLM is a multidisciplinary approach. It handles a variety of data and is connected to many places in the organization: design, engineering, manufacturing, supply chain, support, services. The specialty of a PLM environment is to connect to all data suppliers and interplay with different sources of data. From that standpoint, data behaves like oil: it is located in multiple places and needs to be extracted. You need to use different tools to get it out. Think about different databases as a toolset to process and access data in the most efficient way. Just my thoughts…

Best, Oleg

Do We Need To Coach PLM Backbone with NoSQL?

December 10, 2010

I’d like to take my Beyond PLM conversation away from the social and collaboration trend. Let’s think about the bits and bytes of data, delivery technology and cloud. Navigate your browser to the following link: Microsoft Coaches NoSQL Option for Azure Cloud. I found this article interesting. Microsoft’s play around NoSQL indicates that SQL Server may have some visible alternatives even inside the Microsoft Azure cloud.

What does it mean for PLM? Today’s products are heavy users of the SQL option. DS V6 uses a SQL backend to drive CATIA. A few days ago, I posted about Windchill technological trajectories. CIMdata says Windchill has 900 SQL tables and is still evolving. Reading what Deelip wrote yesterday about PTC from the PTC event in Shanghai, I found the following statement important:

(Q) Does PTC have any plans regarding cloud computing? (A) We view cloud computing as a delivery mechanism and we will take advantage of it when it solves some real problems. An example is Windchill and it does make use of the cloud today.

Does it mean PLM SQL technologies are a step ahead of MS SQL Server? I don’t think so. I can think of potential SQL problems for PLM backbones on the cloud. PLM mindshare leaders will have good partnership agreements with Microsoft. Does it mean Microsoft Azure NoSQL stuff will be immediately available for PLM platforms? How much rewriting will it require from developers of existing PLM products?

So, what is my conclusion? It seems to me PLM R&D teams are experimenting a lot with cloud PLM these days. Next year can be a year of some big introductions in this space. The cloud infrastructure is one big question mark. Some PLM vendors are developing new products. Other vendors (e.g. PTC) consider cloud as just another delivery option. Time will show…

Best, Oleg

noSQL Use Case for PLM

September 15, 2010

I had a chance to read the SQL vs. noSQL article in Linux Journal yesterday on the plane. I found it interesting and, despite the somewhat high level of technical terms, beneficial for our PLM discussion. Navigate your browser to the following link and read this paper. The noSQL story is probably one of the most dramatic in the modern history of data management. In the beginning, it was considered heresy. Now noSQL has come to the point where we can talk about the real advantages and disadvantages of both traditional SQL and noSQL databases. So, what is the noSQL story about?

SQL and RDBMS predictability

The story of SQL databases is tightly connected to two definitions: RDBMS and ACID. ACID means Atomicity, Consistency, Isolation, Durability. RDBMS means Relational Database Management System. The story of RDBMS, SQL and ACID is a story about the development of transactional systems. If you develop a financial system, you want your system to be predictable. You cannot guess about what is going on in your financial records. The same applies when you schedule your manufacturing shop floor operations. ERP systems have leveraged RDBMS heavily throughout their history. A vast majority of ERP systems today are based on SQL/RDBMS.

noSQL Case

noSQL came to us from the internet examples of the past decade, such as Google, Amazon S3 and others. The fact that modern internet kids are using it adds an additional flavor of importance. However, what is the real case behind it? Instead of using relational tables and keys, modern noSQL databases use simple "key-value" stores. Each piece of data going into the database is given a key and can be easily retrieved using the same key. This simplicity provides significant value. The step beyond key-value is "document stores", which can access documents according to specific key values.

PLM Use Case

Product Lifecycle Management has traditional roots in SQL databases. Having started as a pure data management discipline, PDM and PLM systems came to compete with other enterprise systems. It was an obvious decision back in the late 1980s and the beginning of the 1990s that to establish full data control you need to manage your data using an RDBMS. However, this decision was made a long time ago. Since then, PLM has developed lots of use cases. These use cases can bring the importance of predictability down and the importance of flexibility and simplicity up.

What is my conclusion? Development of PDM and PLM systems is not a simple case. The complexity of the systems is high. However, some fundamental decisions and architectures of PDM/PLM were laid down about twenty years ago. The urgency of reducing complexity and increasing flexibility in PLM architectures can raise a noSQL case for PLM. Just my opinion…

Best, Oleg

PLM Platforms: Retirement Or noSQL Knock-Out?

March 1, 2010

I find it interesting that nobody speaks much about PLM platforms these days. It seems to me PLM vendors and service providers are focused on more important issues, such as industry orientation, out-of-the-box functionality, SaaS and OnDemand, or even Open Source business models. However, what happens in the PLM platform department? Is everything fine and well adjusted to the weather outside? Do we have enough power to move forward with all the data we have these days on PLM platforms? Can we scale up in capacity? Can we support agile system development by customers? These and many other questions came to my head. However, I wanted to focus on two specific trends: the need to manage data for the long term and noSQL trends in data management.

Long Term Product Data
This is not a very big secret. We produce more and more data on a daily basis. Product development and manufacturing companies are no exception. Bigger companies like aero OEMs recognized this problem a long time ago. Their working procedures require keeping data for 50+ years as well as tracking information about each aircraft by serial number. Smaller manufacturers are just coming to this point. The additional weight of regulations moves them even faster to the point where the amount of data becomes uncontrollable. There are two aspects of long-term data retention in PLM: (1) 3D and geometrical data; (2) non-geometrical and process-related information. The most interesting project I found in this area is prostep’s LOTAR, so I’m watching the progress of this activity. However, the timeline of LOTAR is seven years, which is probably okay when we talk about 50-year data retention.

noSQL Trends
This is not a top secret either. The really big guys are not running SQL these days: Google, Amazon, Facebook… All these companies developed their own data management facilities. However, despite the coolness effect, the reason behind these initiatives is simple. The ugly truth is that our good friend uncle SQL is reaching middle age. And even if you cannot hear voices about SQL retirement, the question of what our life can look like “after SQL” is very much acceptable. If you are not familiar with the noSQL term, I’d recommend taking a look at this Wikipedia article. Also, I found the following article, The noSQL movement, written by Mark Kellog on his blog, to be very interesting research in this area.

PLM Platforms Data Foundation
All PDM/PLM platforms available on the market today rely on SQL database technology. This is no surprise: SQL is the mainstream technology in the enterprise. I can see two potential problems related to that: change management and data capacity. The first one, change management, seems very critical. Customers are struggling with the level of flexibility PDM/PLM systems can provide. Solutions built on top of SQL data are sensitive to upgrades and data model changes. PLM vendors have developed sophisticated systems to manage this. However, the problem is still in place. The second one is data capacity. This problem has not been uncovered in its full scope yet. I believe that, with future PLM implementations, there is a real chance of discovering scale-related problems.

What is my conclusion today? I think technology matters. The big boys developed alternative non-SQL data storage options. At a time when SQL-based relational databases power our PLM platforms, vendors need to think about what is next. Some initial signs that we need to think about how to manage all company product lifecycle data for 50+ years are in place. There are visible, interesting alternatives. However, they require further investigation by vendors.

Just my thoughts…
Best, Oleg


