Miguel Miguelez: Analyzing SDL Tridion CMS Data

Introduction

One of the challenges we have when evaluate an existing SDL Tridion implementation is the difficulty of having a high level overview of all the items involved in the system, and identify which relations between which item types can lead to identify existing or future problems.

This Article provides some information that can help in order to evaluate/audit an existing SDL Tridion implementation based on the interpretation of data extracted from the CMS Database.
This is applicable to several versions of the product.
Personally I have used this methodology for the versions 5.3, 2009, 2011 and 2013.

Also can be applicable to reporting and keep track of the evolution of the system.

Each Section explained here covers the interpretation of a data set that can be extracted with a SQL statement.
Depending of the implementation, design decisions, data volume, context, etc... those metrics can have different interpretation.

I won't share the SQL statements in this Blog, but I will give an overview of which tables are involved for extracting the Data.

Relation between Schemas
This area covers the relation between schemas in the system

The Tables involved in the data extraction are:

ITEM_REFERENCE_PUBLICATIONS
PUBLICATION_PRIORITIES
ITEMS
ITEM_REFERENCES
ITEM_ASSOCIATIONS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	SCHEMANAME	SCHEMAUSEDNAME
010 Schemas	ARTICLE	PARAGRAPH
010 Schemas	ARTICLE	IMAGE
010 Schemas	ARTICLE	DOCUMENT
010 Schemas	ARTICLE	ARTICLE BODY

Sample of data analysis:

Schema Article is created in the publication 010 Schemas
Schema Article is referencing the schemas PARAGRAPH, IMAGE, DOCUMENT, ARTICLE BODY

The analysis of this data will give us information about:

General overview of how our content model is designed
Distribution of schemas across our Blueprint Model
Identify an abnormal number of relations beween schemas
Identify circular references (Schema A pointing to Schema B and Schema B pointing to Schema A)

Relation between Schemas and Categories
This area covers the relation between schemas and categories in the system

The Tables involved in the data extraction are:

ITEM_ORG_ITEM_ASSOCIATIONS
ITEMS
ORGANIZATIONAL_ITEMS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	SCHEMA	CATEGORY
010 Schemas	ARTICLE	TOPICS
010 Schemas	NEWS	COUNTRY
010 Schemas	NEWS	STATE

Sample of data analysis:

Schema ARTICLE uses the Category TOPICS
Schema NEWS uses the Categories COUNTRY and STATE

The analysis of this data will give us information about:

Relation between Schemas and Categories
Identify an abnormal number of relations Schemas and Categories

Relation between Schemas and Component Templates
This area covers the relation between schemas and component templates in the system

The Tables involved in the data extraction are:

ITEM_REFERENCE_PUBLICATIONS
PUBLICATION_PRIORITIES
ITEMS
ITEM_REFERENCES
ITEM_ASSOCIATIONS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	SCHEMA	COMPONENT TEMPLATE	ALLOWEDONPAGE	DYNAMIC
020 Design	ARTICLE	CT ARTICLE INTRO	1	0
020 Design	ARTICLE	CT ARTICLE FULL	1	0
020 Design	BANNER	CT BANNER	1	1

Sample of data analysis:

Schema ARTICLE has two component templates associated CT ARTICLE INTRO and CT ARTICLE FULL. Both component templates are static and can be used for assemble Articles in pages
Schema BANNER has one component templates associated CT BANNER. This component template is static and can be used for assemble Banners in pages

The analysis of this data will give us information about:

Distribution of component templates across our Blueprint Model
Relation between Schemas and Component Templates
If the current implementation has templates that you can't use when assembling components into a page (ALLOWEDONPAGE=0)
If the current implementation is using an Static (DYNAMIC=0), Dynamic (DYNAMIC=1) of Hybrid publishing Model
If a Schema is used in both Static and Dynamic Component Templates

Relation between Schemas and Components
This area covers the relation between schemas and components in the system

The Tables involved in the data extraction are:

ITEMS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	SCHEMA	COMPONENTS
030 Content English	ARTICLE	100
030 Content English	BANNER	50
040 Content Spanish	ARTICLE	30

Sample of data analysis:

There are 100 components based on the schema ARTICLE in the publication 030 Content English
There are 50 components based on the schema BANNER in the publication 030 Content English
There are 30 components based on the schema ARTICLE in the publication 040 Content Spanish

Those 30 components can be localized or can be new components. This information can also be extracted if required

The analysis of this data will give us information about:

Distribution of components across our Blueprint Model
Volume of components
Usage of schemas when creating content

Relation between Pages and Page Templates
This area covers the relation between pages and page templates in the system

The Tables involved in the data extraction are:

ITEMS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	PAGE TEMPLATE	PAGES
050 Web Site English	PT Main	70
050 Web Site English	PT Article	30
050 Web Site Spanish	PT Main	15
050 Web Site Spanish	PT Article	10

Sample of data analysis:

There are 70 pages based on the page template PT Main in the publication 050 Web Site English
There are 30 pages based on the page template PT Article in the publication 050 Web Site English
There are 15 pages based on the page template PT Main in the publication 050 Web Site Spanish

Those 15 pages can be localized or can be new pages. This information can also be extracted if required

There are 10 pages based on the page template PT Article in the publication 050 Web Site Spanish

Those 10 pages can be localized or can be new pages. This information can also be extracted if required

The analysis of this data will give us information about:

Distribution of pages across our Blueprint Model
Volume of pages
Usage of page templates

Relation between Pages and Components

This area covers the relation between pages and components in the system.

The Tables involved in the data extraction are:

PAGE_CONTENT
ITEMS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	PAGE	CP's > 6
050 Web Site English	Article Index	9
050 Web Site English	Home	12

Sample of data analysis:

There are two pages with more than 6 component presentations assembled

Article Index uses 9
Home uses 12

The analysis of this data will give us information about:

Number of Components assembled in a page
Identify an abnormal number of Components added to a Page
Identify potential bottlenecks during publishing

Example: A page contains 30 Component Presentations using Static Component Templates

Relation between Components and Pages

This area covers the relation between pages and components in the system

The Tables involved in the data extraction are:

PAGE_CONTENT
ITEMS
PUBLICATIONS

Sample of Data Extraction:

PUBLICATION	COMPONENT	Pages > 4
050 Web Site English	BANNER MORTGAGE	12
050 Web Site English	QUICK LINKS	8

Sample of data analysis:

There are two components used in more than with more than 4 pages in the system

BANNER MORTGAGE used in 12 pages
QUICK LINKS used in 8 pages

The analysis of this data will give us information about:

Usage of components across pages
Identify the heavily utilized components
Identify potential bottlenecks during publishing

Example: A component is used in 15 pages assembled with a static component template

Some General Tips when Analyzing the Data

Evaluate the information from a global point of view

Is important when we evaluate an implementation we consider all the different data sets extracted.

Although each one can be interpreted separately, understanding the relation between those data sets provide a lot of useful information related to the system.

Be careful when you interpret the Data

Don't make quick assumptions related to an implementation. There are guidelines/best practices that any implementation should follow but sometimes the reality is different than the theory.

Ensure you collect and interpret the Data properly before making wrong assumptions. Also communicate with the people that has been working with the system for a while, they will give you very valuable information that can help in the interpretation.

Examples:

Example 1:

If we observe a bottleneck in the publisher when publishing a banner named "BANNER HOLIDAYS", based on the schema "BANNER", based on those metrics we could see that that component is used in 20 pages and the Component Template associated to the schema "BANNER" schema is static, so the republishing of that banner will trigger the republish of 20 pages.

Example 2:

If we observe that there is a reasonable number of Page Templates and Component Templates localized in the system across several publications, that means that any changes in Page Templates and Component Templates will take longer than if those were not Localized.

2 comments:

NivlongJanuary 3, 2014 at 12:32 PM
Thanks for sharing, Miguel! I find your last point is especially important.

As I come across more implementations I find that certain approaches may not seem "best practice" at first. But the approach fits and may even redefine what we consider best practice. For example, I told a customer that combined schema and template publications (and it fit their setup) to "not let anyone guilt you into changing your BluePrint."

I'm looking forward to your next posts!
VIKAS KUMAR (old blog, new: https://vikaskumarblogs.wordpress.com)April 21, 2016 at 11:15 AM
HI Miguel,

Wonderful Post ! This one is helping me in my 2013 to Web 8 upgrade.

Just a small typo correction I guess in the line below -
"Schema BANNER has one component templates associated CT BANNER. This component template is static and can be used for assemble Banners in pages "

Here - "This component template is DYNAMIC " . Cheers !

Thursday, January 2, 2014

Analyzing SDL Tridion CMS Data

2 comments: