Archive for July, 2007

Book Review: Alfresco by Munwar Shariff

July 27, 2007 doquent
This is a review of the book – Alfresco: Enterprise Content Management Implementation by Munwar Shariff. Click on the cover image to see the publisher page for the book. The book is published by PACKT Publishing – an active supporter of open-source projects.

Disclaimers / Disclosures

  1. This review represents my independent opinion.
  2. No one has solicited this review. Specifically, I have not been paid by the author or Alfresco for this review.
  3. The publisher has provided me with a free copy of the book for writing this review.

Approach

This review does not attempt to provide information about the Alfresco platform outside the context of the book. The review focuses on what the book offers, what it does well, and where it could have done better. The content outline and information about the author are available on the publisher’s page for the book.

Review

As far as I am aware, this is the only book available today about the Alfresco open source content management platform. The book has the full backing of Alfresco, the company, and apparently has been written at their request. As such, the author seems to have had full access to the information about the platform at the time of writing this book. The book addresses version 1.4 but most of the information applies to 2.0 as well.

I would describe this book in one sentence as a handy reference with illustrated step-by-step examples for performing various tasks with Alfresco.

The book starts with an overview of content management, open source, and Alfresco. Then it describes how to obtain and install Alfresco. Before going further, it shares some best practices related to planning. In general, I look for words of wisdom and advice in books which come from experience of the authors and add value above raw information which might be available from other sources as well. Therefore, I see this chapter as a good addition to the reference material present in the rest of the book. Then the book dives into the guts of Alfresco features – both standard features and customization options. It also addresses the needs of system administrators for maintaining the deployment such as export, import, backup, upgrades, and general maintenance. Finally, the book closes with a chapter on a sample content management solution which is also a very common need – a forms-based imaging solution for digitizing paper documents and managing the electronic documents in Alfresco.

I had been using Alfresco for 6 months before I started reading this book. I had also tried out Alfresco WCM 2.0 using the evaluation guide available with the software download. Therefore, most of the core concepts were a breeze to read through. I found the following information to be most interesting and educational beyond what I already knew – information about customization (all sorts of features that can be customized), business rules and workflows, information about maintaining the deployment, and the sample solution. The pointers to the API and model references are also useful since that’s what I would need if I were to customize Alfresco beyond the simple examples.

Given the scope of the book most of the material is probably at an appropriate level of detail. However, I felt that certain changes could have made this book more useful for a wider audience. The first thing that jumped out at me was that the book is written pretty much in a “how-to with Alfresco” format. While that is great, it almost assumes a fundamental knowledge of content management concepts. It would have been great to have a generic (non Alfresco-specific) introduction to related content management concepts at the beginning of each chapter. This would have made it a great book for people who are first-time content management users. The second area of potential improvement is with the flow of topics. There were times when I read a section and wished there was more information in that section. Then a few sections later I would find a detailed section on the same information. This issue could have been addressed with a different organization of sections. However, more generally, I felt that the book could use some more text to glue the sections together and to make the transitions smoother.

Conclusion

This book is definitely a must-have for anyone implementing a solution using Alfresco. It is a handy reference for performing various tasks on the Alfresco platform. If you are new to content management in general then you may need additional resources to make effective use of the information available in this book.

Categories: Alfresco

Alfresco Feature Matrix

July 25, 2007 doquent

CMS Matrix provides information on what seems to be a pretty exhaustive list of CMS software products. Here is what they have to say about Alfresco.

Categories: Alfresco

Collaborating on Alfresco

July 25, 2007 doquent

From the perspective of frequency of changes, document storage can be divided broadly into two phases:

  1. Teams collaborate to create, revise, finalize, and publish documents
  2. Published documents are stored for long term and don’t change frequently

While features like versioning and security are good for document management, people may still resort to using email and file attachments with naming conventions for discussion around documents. Basic discussion capabilities are a must before a platform can claim support for collaboration.

Alfresco offers built-in support for discussion forums (forum spaces), topics, and replies on topics – core features available in any discussion forum product/framework. Further, a discussion thread can be associated with a document or a space. A specific set of users and groups can be invited to a space to participate in discussions there. It is also possible to send email to all the users invited to a space.

Going back to my favorite question – how does Documentum support collaboration? There are two levels and really they are two separate solutions. Documentum eRoom is a full-fledged collaboration solution. It has features such as calendars, discussion threads, databases, and dashboards. eRoom was a separate product prior to being acquired by Documentum. It is built upon Microsoft technologies and has a different architecture compared to the Documentum platform. Slowly, integrations between eRoom and the Documentum platform have become available. For example, eRoom can be a front-end while using a Documentum repository for content storage. Thus, if you have a Documentum deployment on Linux or UNIX servers, you will need to bring some Microsoft technologies/servers into the mix if you want to use eRoom.

With the arrival of Documentum Collaborative Edition (DCE), some collaboration features can be directly enabled in the Content Server and Webtop (the basic Web interface to a Documentum repository). The collaborative features are also available through some other Content Server clients. DCE provides a level of collaboration support that is similar to what Alfresco offers. Either way, collaborative capabilities in Documentum require additional licenses.

To summarize – Alfresco offers basic collaboration capabilities without many bells and whistles. If you need richer collaboration features with Alfresco, you may have to deal with some integration. However, this approach appears to be consistent with the overall Alfresco approach of providing the most-used features first and keeping the platform lean and fast.

Categories: Alfresco

Powerful Aspects

July 24, 2007 doquent

Aspects are not really a new concept though their popularity and utility have come to the fore with the recent availability of support in programming languages, tools, and frameworks. Aspects enable us to separate various (cross-cutting) concerns from the core ones. This is a good design goal though a poor application of aspects can also undo any desired benefits. There are numerous useful resources related to this topic which can be located by searching for aspect or aspect-oriented programming (AOP). Here I want to share what aspects can do for a content management system like Alfresco.

A content item (typically a document) consists of two pieces of information – the content (file) and metadata (properties/attributes) that contains information about the document. For example, common metadata for a document includes title, subject, and authors. The properties included in the metadata are specified by a content type. In terms of object-oriented design, the content type is not much different from a class.

Similarly, content types often support an inheritance-hierarchy for defining new content types based on existing content types. There are two well-known issues along with the benefits of such a type-inheritance hierarchy:

  1. Single inheritance (one type can only have one parent type) limits the reuse of multiple existing types in a new type. Multiple inheritance has its own problems and has generally fallen out of favor.
  2. A document of a particular type can only have the properties mandated by the type.

Consider this simple example. Suppose there are certain documents that need to be published in some sense and require an effective date range in terms of an effective_date and an expiration_date. Further, these documents may be of arbitrary content types – the only constraint is that if they need to be published they should have these attributes. If we are using the type inheritance-hierarchy these two properties need to be present in common base type(s) for all possible document types that may have this need to be published. As a by-product these two properties will be present on all the documents of these types, no matter whether they are published or not.
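To make the by-product concrete, here is a minimal Python sketch of the inheritance-only approach. The type names (PublishableDocument, Memo, Contract) are hypothetical and exist only for illustration; they are not Alfresco or Documentum types.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PublishableDocument:
    """Hypothetical common base type that carries the effective date range."""
    title: str
    effective_date: Optional[date] = None
    expiration_date: Optional[date] = None


@dataclass
class Memo(PublishableDocument):
    """A content type whose documents may never be published."""
    recipient: str = ""


@dataclass
class Contract(PublishableDocument):
    """Another content type that sometimes needs the date range."""
    counterparty: str = ""


# An internal memo that will never be published still carries both fields.
draft = Memo(title="Lunch order", recipient="team")
unused = [name for name in ("effective_date", "expiration_date")
          if getattr(draft, name) is None]
print(unused)
```

Every Memo and every Contract gets the two date fields, whether or not the document is ever published.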

The key issue here is separation of concerns. The effective date range is required for certain documents (irrespective of their types) depending upon their intended usage. Essentially, if a document needs to be published in some way and should be available only during the effective date range then those documents should have these two properties.

Aspects serve this purpose beautifully. An aspect can be thought of as a pseudo-object that can be slapped on to another object – effectively resulting in additional properties on selected objects. Aspects are not bound by the content type of the object. In Alfresco, the problem described above is solved with the built-in aspect called Effectivity. Just add Effectivity on specific documents or specify a rule on a space (folder) to apply this aspect to all the documents added (possibly recursively) to a space.
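The idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Alfresco's API: the Aspect and Node classes, the is_effective helper, and the property names (borrowed from the example above) are all assumptions of mine.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Dict, Set, Tuple


@dataclass(frozen=True)
class Aspect:
    """Hypothetical aspect: a named bundle of properties, independent of type."""
    name: str
    property_names: Tuple[str, ...]


# Stand-in for the effectivity idea; the property names are illustrative.
EFFECTIVITY = Aspect("effectivity", ("effective_date", "expiration_date"))


@dataclass
class Node:
    """Hypothetical content item with a type, properties, and attached aspects."""
    name: str
    content_type: str
    properties: Dict[str, Any] = field(default_factory=dict)
    aspects: Set[str] = field(default_factory=set)

    def add_aspect(self, aspect: Aspect, **values: Any) -> None:
        unknown = set(values) - set(aspect.property_names)
        if unknown:
            raise ValueError(f"not part of {aspect.name}: {sorted(unknown)}")
        self.aspects.add(aspect.name)
        # The aspect contributes its properties to this one node only.
        for prop in aspect.property_names:
            self.properties.setdefault(prop, None)
        self.properties.update(values)

    def has_aspect(self, aspect: Aspect) -> bool:
        return aspect.name in self.aspects


def is_effective(node: Node, today: date) -> bool:
    """True only for nodes that carry the aspect and are inside the date range."""
    if not node.has_aspect(EFFECTIVITY):
        return False
    start = node.properties["effective_date"]
    end = node.properties["expiration_date"]
    return start is not None and end is not None and start <= today <= end


memo = Node("lunch.txt", "memo")            # never published, no extra properties
notice = Node("notice.pdf", "contract")     # published, so it gets the aspect
notice.add_aspect(EFFECTIVITY,
                  effective_date=date(2007, 7, 1),
                  expiration_date=date(2007, 7, 31))
print(is_effective(notice, date(2007, 7, 24)), memo.properties)
```

The two nodes have different content types, yet only the one that needs a date range carries the extra properties.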

How does Documentum handle this challenge? In the current version (5.3), a_effective_date and a_expiration_date are properties on dm_sysobject type, which is a base type of several other types. As a result, these properties are present on a lot of objects even though they might not need them. Also the Object Reference Guide groups various attributes under labels such as X-related, Y-related, etc., some of which are not related to the core concerns and may be better served by aspects. However, the next major version (D6) plans to add support for aspects. Although this will facilitate management of new aspects, I am wondering if there are plans to refactor the existing organization of types. The dilemma is between leaving the bloat in place and creating a lot of change (due to reorganization with upgrade) with potential regression effects.

Documentum is hardly to be blamed in this regard since Alfresco, being a relative newcomer in the space, probably had the advantage of hindsight. On the other hand, one downside of being a newcomer is that Alfresco had to start with a 0% market share.

My concluding thought is one of caution. It is easy to make a golden hammer out of aspects. Suddenly, there is a choice between putting attributes on a content type or on an aspect. Usage of aspects will also carry some overhead of its own. Indiscriminate adoption of aspects may even lead to a maintenance nightmare. So measure twice and cut once – reap the best of strong typing as well as of aspects.

Categories: Documentum

Alfresco WCM 2.0 Review

July 23, 2007 doquent

Alfresco WCM 2.0 offers exciting features for rapid deployment of web content management solutions. It also facilitates management of business-level or application-level features on a web site. Content creation and approval is pleasantly smooth with sandbox previews for individual users.

Read my review of Alfresco WCM 2.0 highlighting the features mentioned above.

Categories: Alfresco

Smart Space, Hot Folder, and Workflows

July 20, 2007 doquent

Alfresco uses spaces for organizing content/document storage. My first impression was that space is just a fancy name for a folder. As I learned more the meaning began to sink in. While it is possible to treat an Alfresco repository just like a folder tree on the disk, most of the useful capabilities can be utilized at a folder level. Further, when looking at the collaboration capabilities with a team working together in one area, space begins to make more sense.

As I learned about the rules that can be enabled at the folder level, which would apply to all documents stored (recursively) within a space, the neat folder-based workflow model became obvious. This is not a new concept by any means. If you search for “hot folder” on Google you will find that people have been using this concept in a wide variety of applications. In particular, AntFlow gives you this general-purpose capability to create arbitrary folder-based workflows (called simple workflows). What’s neat about Alfresco is that it offers this capability in an Enterprise Content Management (ECM) framework. An Alfresco repository can be used as a Windows share using the built-in CIFS interface. So you can just drag a file into such a folder share and have workflow and other rules such as rendition creation (PDF, for example) triggered automatically. There is a long list of possible actions that these rules can invoke – including automatic metadata extraction from MS Office documents and changing security for the document. Since this is a core capability, an end user can set up such a workflow as well as a developer.
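The hot folder idea is easy to picture in code. The following Python sketch is a toy model under my own assumptions, not Alfresco's rule engine: Space, Rule, Document, and the two actions are hypothetical names, and the "rendition" is just a file name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Document:
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    renditions: List[str] = field(default_factory=list)


Action = Callable[[Document], None]


@dataclass
class Rule:
    """Hypothetical rule: an action run on every document added to a space."""
    action: Action
    apply_to_children: bool = True  # whether sub-spaces inherit the rule


@dataclass
class Space:
    name: str
    parent: Optional["Space"] = None
    rules: List[Rule] = field(default_factory=list)
    documents: List[Document] = field(default_factory=list)

    def applicable_rules(self) -> List[Rule]:
        # Own rules first, then inheritable rules from each ancestor space.
        found = list(self.rules)
        ancestor = self.parent
        while ancestor is not None:
            found.extend(r for r in ancestor.rules if r.apply_to_children)
            ancestor = ancestor.parent
        return found

    def add(self, doc: Document) -> None:
        # Dropping a document into the space triggers the rules.
        self.documents.append(doc)
        for rule in self.applicable_rules():
            rule.action(doc)


def make_pdf_rendition(doc: Document) -> None:
    stem = doc.name.rsplit(".", 1)[0]
    doc.renditions.append(stem + ".pdf")


def mark_for_review(doc: Document) -> None:
    doc.properties["status"] = "pending review"


inbox = Space("Inbox", rules=[Rule(make_pdf_rendition),
                              Rule(mark_for_review, apply_to_children=False)])
scans = Space("Scans", parent=inbox)

report = Document("report.doc")
scans.add(report)    # only the inheritable rendition rule fires
direct = Document("memo.doc")
inbox.add(direct)    # both rules fire
print(report.renditions, direct.properties)
```

Whoever drops the file into the space does not need to know which rules exist; the space decides what happens to the document.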

Alfresco also supports fully-featured and task-oriented (as opposed to content or space-oriented) workflows. These are called advanced workflows and are implemented using jBPM (jBoss Business Process Management). These workflows can be designed using jBPM plugins for Eclipse.

Due to my background with Documentum, I cannot help but compare how similar things are achieved on the two platforms. Workflows in Documentum refer to the task-oriented workflows, similar to the advanced workflows in Alfresco. While folder-based workflows are possible in Documentum, they would typically require customization. In Documentum, a workflow template needs to be designed, validated, and installed before a workflow instance can be created from that template by the end users. If you want a nicer workflow solution, you need to buy additional licenses for Business Process Manager (BPM). There are also separate per-user licenses for the Web-based interface (Webtop) and if you want collaboration capabilities in the interface, you need to buy additional licenses. That shouldn’t be a surprise though since commercial systems are designed to be sold with separable feature bundles.

Another nice capability you have in Alfresco is that you can define aspects and automatically attach them to all the documents beneath a particular space, irrespective of the type of document. Documentum plans to support aspects in their next major release – D6. Aspects offer exciting capabilities and will be discussed in a separate post.

Overall, I feel that the Alfresco approach to rules, actions, and workflows is very flexible and gives users a lot of freedom to get creative to meet their needs.

Categories: Alfresco

Why Alfresco?

July 18, 2007 doquent

I have been building enterprise applications for what now seems to be ages. Technologies have come and gone; some became commodities, others proved to be just fads. What is increasingly obvious today is that there is nothing like a global user base which can stabilize products rapidly as well as weed out the ones that don’t listen. As a result we have seen wonderful products take center stage. Apache, Tomcat, Spring, Hibernate, Lucene, the LAMP (Linux, Apache, MySQL, PHP) platform – the list can go on. And now Alfresco appears on the scene, catering to a burgeoning demand for enterprise content management solutions while building on top of the already mature and widely-accepted open source technologies listed above. So what? It’s the feeling that I have come to associate with such open source products – “It just works!”

I have used various commercial enterprise products over the past 10 years or so. In recent years, I have developed expertise on EMC Documentum which is a high-end, expensive content management platform loaded with all the features you can think of; and then some. For various reasons, it is significantly complex as well. I believe that this complexity, more than any other thing, puts it out of the expertise limits of a large number of professionals. As a result there is an imbalance between the demand and supply of such expertise which drives the professional fees higher for Documentum professionals. Who am I to complain?

There is a huge existing market for Documentum and I don’t expect it to disappear any time soon. Accordingly there will be a need for such services and the professionals who can market their services successfully will thrive. Then I start thinking about the market that doesn’t buy Documentum for various reasons.

A cheapskate like me would test drive Alfresco in no time because it is free. And when I did I was in for a surprise. It was all set up in about 20 minutes and it just worked! I have probably installed Documentum about 50 times now and it was never so quick and I know now how to get it to work most of the time. This comparison is a bit unfair because an enterprise grade production configuration will probably take longer than 20 minutes for Alfresco. But a development environment setup for Documentum in 20 minutes? I would love to see that.

At this point, I am not going to compare Alfresco and Documentum much more. All I would say is that if you haven’t committed to a particular platform for enterprise content management and you are looking for one, give Alfresco a try.

I am hooked because it worked the first time I installed it and it has been working fine ever since. I can just drag a document to a shared folder path which is actually a folder in the repository – the document gets versioned, metadata is populated automatically for most document types, and it is indexed for full-text searches! If you want to see simplicity, robustness, and elegance in one place, give Alfresco a try!

Categories: Alfresco