Thursday, January 2, 2014

Analyzing SDL Tridion CMS Data

Introduction

One of the challenges we have when evaluate an existing SDL Tridion implementation is the difficulty of having a high level overview of all the items involved in the system, and identify which relations between which item types can lead to identify existing or future problems.

This Article provides some information that can help in order to evaluate/audit an existing SDL Tridion implementation based on the interpretation of data extracted from the CMS Database.
This is applicable to several versions of the product.
Personally I have used this methodology for the versions 5.3, 2009, 2011 and 2013.

Also can be applicable to reporting and keep track of the evolution of the system.  

Each Section explained here covers the interpretation of a data set that can be extracted with a SQL statement.
Depending of the implementation, design decisions, data volume, context, etc... those metrics can have different interpretation.

I won't share the SQL statements in this Blog, but I will give an overview of which tables are involved for extracting the Data.

Relation between Schemas
This area covers the relation between schemas in the system

The Tables involved in the data extraction are:

  • ITEM_REFERENCE_PUBLICATIONS
  • PUBLICATION_PRIORITIES
  • ITEMS
  • ITEM_REFERENCES
  • ITEM_ASSOCIATIONS
  • PUBLICATIONS

Sample of Data Extraction:

PUBLICATION SCHEMANAME SCHEMAUSEDNAME
010 Schemas ARTICLE PARAGRAPH
010 Schemas ARTICLE IMAGE
010 Schemas ARTICLE DOCUMENT    
010 Schemas ARTICLE ARTICLE BODY

Sample of data analysis:
  • Schema Article is created in the publication 010 Schemas
  • Schema Article is referencing the schemas PARAGRAPH, IMAGE, DOCUMENT, ARTICLE BODY


The analysis of this data will give us information about:

  • General overview of how our content model is designed
  • Distribution of schemas across our Blueprint Model  
  • Identify an abnormal number of relations beween schemas
  • Identify circular references (Schema A pointing to Schema B and Schema B pointing to Schema A)


Relation between Schemas and Categories
This area covers the relation between schemas and categories in the system

The Tables involved in the data extraction are:
  • ITEM_ORG_ITEM_ASSOCIATIONS
  • ITEMS
  • ORGANIZATIONAL_ITEMS
  • PUBLICATIONS


Sample of Data Extraction:


PUBLICATION     SCHEMA CATEGORY
010 Schemas   ARTICLE TOPICS
010 Schemas   NEWS COUNTRY
010 Schemas     NEWS STATE

Sample of data analysis:
  • Schema ARTICLE uses the Category TOPICS
  • Schema NEWS uses the Categories COUNTRY and STATE

The analysis of this data will give us information about:

  • Relation between Schemas and Categories
  • Identify an abnormal number of relations Schemas and Categories

Relation between Schemas and Component Templates
This area covers the relation between schemas and component templates in the system

The Tables involved in the data extraction are:
  • ITEM_REFERENCE_PUBLICATIONS     
  • PUBLICATION_PRIORITIES          
  • ITEMS                           
  • ITEM_REFERENCES                 
  • ITEM_ASSOCIATIONS                
  • PUBLICATIONS

Sample of Data Extraction:

PUBLICATION SCHEMA COMPONENT TEMPLATE    ALLOWEDONPAGE                                   DYNAMIC
020 Design ARTICLE CT ARTICLE INTRO 1 0
020 Design ARTICLE CT ARTICLE FULL 1 0
020 Design BANNER CT BANNER 1 1

Sample of data analysis:
  • Schema ARTICLE has two component templates associated CT ARTICLE INTRO and CT ARTICLE FULL. Both component templates are static and can be used for assemble Articles in pages  
  • Schema BANNER has one component templates associated CT BANNER. This component template is static and can be used for assemble Banners in pages 

The analysis of this data will give us information about: 
  • Distribution of component templates across our Blueprint Model 
  • Relation between Schemas and Component Templates   
  • If the current implementation has templates that you can't use when assembling components into a page  (ALLOWEDONPAGE=0)  
  • If the current implementation is using an Static (DYNAMIC=0), Dynamic (DYNAMIC=1) of Hybrid publishing Model 
  • If a Schema is used in both Static and Dynamic Component Templates

Relation between Schemas and Components
This area covers the relation between schemas and components in the system

The Tables involved in the data extraction are:

  • ITEMS 
  • PUBLICATIONS

Sample of Data Extraction:

PUBLICATION SCHEMA        COMPONENTS
030 Content English ARTICLE 100
030 Content English BANNER 50
040 Content Spanish ARTICLE 30

Sample of data analysis:
  • There are 100 components based on the schema ARTICLE in the publication 030 Content English  
  • There are 50 components based on the schema BANNER in the publication 030 Content English  
  • There are 30 components based on the schema ARTICLE in the publication 040 Content Spanish
    • Those 30 components can be localized or can be new components. This information can also be extracted if required  

The analysis of this data will give us information about: 
  • Distribution of components across our Blueprint Model 
  • Volume of components 
  • Usage of schemas when creating content  

Relation between Pages and Page Templates
This area covers the relation between pages and page templates in the system

The Tables involved in the data extraction are:

  • ITEMS
  • PUBLICATIONS 

Sample of Data Extraction:

PUBLICATION PAGE TEMPLATE                PAGES
050 Web Site English PT Main 70
050 Web Site English PT Article 30
050 Web Site Spanish PT Main 15
050 Web Site Spanish PT Article 10

Sample of data analysis:
  • There are 70 pages based on the page template PT Main in the publication 050 Web Site English  
  • There are 30 pages based on the page template PT Article in the publication 050 Web Site English  
  • There are 15 pages based on the page template PT Main in the publication 050 Web Site Spanish  
    • Those 15 pages can be localized or can be new pages. This information can also be extracted if required 
  • There are 10 pages based on the page template PT Article in the publication 050 Web Site Spanish  
    • Those 10 pages can be localized or can be new pages. This information can also be extracted if required  


The analysis of this data will give us information about: 
  • Distribution of pages across our Blueprint Model 
  • Volume of pages 
  • Usage of page templates  

Relation between Pages and Components
This area covers the relation between pages and components in the system.

The Tables involved in the data extraction are:
  • PAGE_CONTENT 
  • ITEMS
  • PUBLICATIONS
Sample of Data Extraction:

PUBLICATION PAGE               CP's > 6
050 Web Site English Article Index 9
050 Web Site English Home 12

Sample of data analysis:
  • There are two pages with more than 6 component presentations assembled 
    • Article Index uses 9
    • Home uses 12   

The analysis of this data will give us information about:
  • Number of Components assembled in a page  
  • Identify an abnormal number of Components added to a Page
  • Identify potential bottlenecks during publishing
    • Example: A page contains 30 Component Presentations using Static Component Templates 


Relation between Components and Pages
This area covers the relation between pages and components in the system

The Tables involved in the data extraction are:
  • PAGE_CONTENT 
  • ITEMS
  • PUBLICATIONS
Sample of Data Extraction:

PUBLICATIONCOMPONENT            Pages > 4
050 Web Site EnglishBANNER MORTGAGE12
050 Web Site EnglishQUICK LINKS8

Sample of data analysis:
  • There are two components used in more than with more than 4 pages in the system 
    • BANNER MORTGAGE used in 12 pages
    • QUICK LINKS used in 8 pages

The analysis of this data will give us information about: 
  • Usage of components across pages  
  • Identify the heavily utilized components
  • Identify potential bottlenecks during publishing
    • Example: A component is used in 15 pages assembled with a static component template 


Some General Tips when Analyzing the Data

Evaluate the information from a global point of view
Is important when we evaluate an implementation we consider all the different data sets extracted. 
Although each one can be interpreted separately, understanding the relation between those data sets provide a lot of useful information related to the system.

Be careful when you interpret the Data
Don't make quick assumptions related to an implementation. There are guidelines/best practices that any implementation should follow but sometimes the reality is different than the theory.
Ensure you collect and interpret the Data properly before making wrong assumptions. Also communicate with the people that has been working with the system for a while, they will give you very valuable information that can help in the interpretation.


Examples:

Example 1:
If we observe a bottleneck in the publisher when publishing a banner named "BANNER HOLIDAYS", based on the schema "BANNER", based on those metrics we could see that that component is used in 20 pages and the Component Template associated to the schema "BANNER" schema is static, so the republishing of that banner will trigger the republish of 20 pages.

Example 2:
If we observe that there is a reasonable number of Page Templates and  Component Templates localized in the system across several publications, that means that any changes in Page Templates and Component Templates will take longer than if those were not Localized.