Catalyst - Background Knowledge

The Catalyst documentation presupposes certain background experience; if you have never had cause to acquire that experience, learning Catalyst can seem overwhelming. This page tries to give you enough of the background that you will be able to work through your initial encounters with Catalyst; it is not an exhaustive or authoritative source of information on any of the following topics.

Mixing controller code, presentation code, data access code and business logic in the same place creates many interdependencies: changing one part of your software can have side effects on many other parts, which complicates maintenance. The coupling also makes it difficult or even impossible to reuse your code in other projects. The MVC pattern addresses both problems by encouraging software reuse and easing maintenance.

The MVC Pattern

MVC, or "Model-View-Controller", is an object-oriented pattern for separating the various concerns of an application. Traditionally, these concerns are: accepting and processing input, processing information, and displaying information. There are two separate schools of thought for exactly how it should work, but both schools agree on the following:

  1. There are three main objects involved: the Model, the View, and the Controller.
  2. The Model object handles the connection to the backing store (usually a database), fetching and storing data as needed.
  3. The View handles presenting information to the user.
  4. The Controller handles request processing, including authentication and access control.
  5. The business logic (that is, the code that is specific to your application) is not in the View object.
  6. The Model does not do anything web-specific.
  7. Web-specific code is kept separate from components that should be reusable elsewhere (in cron jobs, for example).

Here is where the two schools diverge. The older school (popularized by the Smalltalk language) holds that the business logic should go in the Model. The younger school (popularized by recent web applications) holds that the business logic should go in the Controller. Both approaches work, and each has pros and cons. Which one you choose is more a matter of personal mindset and preference than anything else: does it suit your thinking better to have a smart Controller managing a thin database connection (the Model), or a smart Model that is tracking the state of your application and telling a thin request handler (the Controller) how to process requests?

A few further notes: in the context of a web application, the job of the View--"presenting information to the user"--usually means populating some sort of page template with relevant information and then sending the result to the user's browser.

In the context of a web application, the above-described job of the Controller--"handl[ing] request processing, including authentication and access control"--means taking the HTTP request from the webserver (request processing), verifying the user's identity (authentication) and the user's right to view a page or perform an action (access control), and then instructing whichever object contains the business logic to handle the request.
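The division of labour described above can be sketched in plain Perl. This is only an illustration, not Catalyst code: the package names (My::Model, My::View, My::Controller), the stock data and the request hash are all hypothetical, and the business logic is placed in the Model, following the older school.

```perl
use strict;
use warnings;

# Model: owns the data and (in this variant) the business logic.
# It knows nothing about HTTP, so a cron job could use it too.
package My::Model;

sub new {
    my ($class) = @_;
    # Hypothetical in-memory "backing store".
    return bless { stock => { apple => 3, pear => 0 } }, $class;
}

sub in_stock {
    my ($self, $item) = @_;
    return ( $self->{stock}{$item} // 0 ) > 0;
}

# View: turns data into something presentable.
package My::View;

sub render {
    my ($class, $data) = @_;
    return sprintf "%s: %s\n", $data->{item},
        $data->{available} ? 'in stock' : 'sold out';
}

# Controller: request processing and access control, then delegation.
package My::Controller;

sub handle {
    my ($class, $model, $request) = @_;
    return "Forbidden\n" unless $request->{user};    # access control
    my $item = $request->{item};
    return My::View->render(
        { item => $item, available => $model->in_stock($item) } );
}

package main;

my $model = My::Model->new;
print My::Controller->handle( $model, { user => 'alice', item => 'apple' } );
print My::Controller->handle( $model, { user => 'alice', item => 'pear' } );
print My::Controller->handle( $model, { item => 'apple' } );
```

Moving the in_stock check into the Controller instead would give the "smart Controller, thin Model" arrangement favoured by the younger school.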

There is more on this subject in the Catalyst mailing list archives at http://lists.rawmode.org/pipermail/catalyst/2005-August/thread.html#1148 or at http://lists.scsys.co.uk/pipermail/catalyst/2005-August/001148.html

Another nice introduction (and reasons why Catalyst doesn't take separation too seriously) can be found at: http://www.andywardley.com/computers/web/mvc.html

A Very Brief History of Web Technology

(Note: This section is heavily oversimplified, perhaps to the point of error.)

In the beginning was CGI (the Common Gateway Interface). CGI is a specification detailing how HTTP requests can pass information into, and get information out of, programs.
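Under CGI, the webserver hands request data to the program through environment variables (such as QUERY_STRING) and standard input, and reads the response from standard output. The following is a minimal sketch of such a program; the cgi_response helper and the "name" parameter are made up for illustration, and URL-decoding is deliberately left out to keep it short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a complete CGI response (headers, blank line, body) from a
# raw query string such as "name=Bob&x=1".
sub cgi_response {
    my ($query) = @_;
    my %param;
    for my $pair ( split /[&;]/, $query // '' ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $param{$key} = $value // '';
    }
    my $name = $param{name} // 'world';
    $name =~ s/[^A-Za-z0-9_-]//g;    # keep only harmless characters
    return "Content-Type: text/plain\r\n\r\nHello, $name!\n";
}

# The webserver sets QUERY_STRING before running the script and sends
# whatever the script prints back to the browser.
print cgi_response( $ENV{QUERY_STRING} );
```

Everything beyond this (sessions, authentication, templating) is what the author had to write by hand, which is the second drawback discussed next.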

CGI programs were a great technology, but they suffered from two drawbacks: they were slow, and they required authors to do everything themselves. By now, the speed problem has been pretty much solved (chiefly by mod_perl; alternatives include FastCGI and SpeedyCGI). More and more technology was invented to address the second problem: SSI (Server Side Includes, a primitive form of templating in which the server sticks various chunks of text together without the program needing to be involved), various modules for handling authentication, session tracking, etc. (see the entire Apache::* namespace on CPAN), and so on.

After that came full-up templating systems, which offered loops, conditionals, and other tools that made for more powerful content-generation than the simple SSI of yesteryear. The most recent step has been frameworks: toolkits that would handle all the generic tasks (authentication, access control, session tracking, logging, etc) for you and leave you to just write the business logic with an absolute minimum of overhead. Catalyst is a framework.

Managing The Apache Webserver

The Apache webserver is one of the poster children of the Open Source movement. At the time of writing it runs something over 60% of all the webservers on the Internet, far more than any competitor. The primary reference site is http://www.apache.org.

There are two main branches of Apache, the 1.3 branch and the 2.0 branch. 1.3 is still being maintained, but further development is not being done. 2.0 is the active branch; it has several significant advantages over 1.3, including being multithreaded.

The Apache binary is called httpd. The configuration file which controls how it runs is httpd.conf. Assuming that httpd is in your path, you can type 'httpd -V' to have the server tell you all the critical information you need: what version your binary is (e.g., 1.3.33), where the httpd.conf is, where the error logs are, and what options were compiled into your binary.

TODO:

  • Add brief section on httpd.conf directives
  • Add pointers to example httpd.conf files
  • Add brief discussion of security issues

The Template Toolkit

At base, a templating system is very simple: you feed it some text that contains marked sections (e.g. variable names) and it replaces those marked sections with the appropriate values. By this definition, the simplest templating system is: eval $string
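As a rough sketch of that idea, here is a tiny substitution-only templating function. It uses a regular expression rather than eval, and it borrows the [% name %] marker style purely for illustration; fill_template is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;

# Replace each [% name %] marker with the matching value from $vars.
# Unknown names are replaced with the empty string.
sub fill_template {
    my ( $text, $vars ) = @_;
    $text =~ s{\[%\s*(\w+)\s*%\]}{ $vars->{$1} // '' }ge;
    return $text;
}

print fill_template( "I would like one [% fruit %], please.\n",
    { fruit => 'apple' } );
```

Real templating systems add loops, conditionals and much more on top of this basic substitution.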

There are literally dozens of Perl templating systems in existence. The dominant mind share seems to be divided between Template Toolkit, Mason, HTML::Template, and perhaps one or two others. Catalyst can be used with any templating system, of course (there are plugins for the three mentioned above), but many of the docs give examples for the Template Toolkit (TT for short).

In TT, a template is a text file that has been marked up with template directives. Directives are (by default) enclosed in special markers, like so:

[% put your directives here %]

Here is a simple Perl script that uses TT:

    use strict;
    use warnings;
    use Template;

    # The template file and the variables it can refer to.
    my $file = 'example.tt';
    my $vars = {
        fruit  => 'apple',
        colors => [ qw(red yellow green blue) ],
    };

    my $template = Template->new();

    # Process the template and print the result to STDOUT.
    $template->process($file, $vars)
        || die "Template process failed: ", $template->error(), "\n";

And here is example.tt, which demonstrates some of the more commonly used directives:

    [% fruit %]   [%# Will be replaced with 'apple' %]

    [%# The following generates an HTML droplist, populated with the
        options:
            red
            yellow
            green
            blue
    %]
    <select>
    [% FOREACH c IN colors %]
        <option>[% c %]</option>
    [% END %]
    </select>

    [%# The following will generate the text 'Apples are red' %]
    [% IF fruit == 'apple' %]
       Apples are red
    [% END %]

    [%# ELSIF and ELSE are both available %]
    [% IF today == "friday" %]
       Yay! It's Friday!
    [% ELSIF today == "monday" %]
       Yuck.  It's Monday.
    [% ELSE %]
       It's a normal day.
    [% END %]

TT supports the following comparison and logical operators:

    ==          Test for equality
    !=          Test for inequality
    <           Less than
    <=          Less than or equal to
    >           Greater than
    >=          Greater than or equal to
    &&, AND     Logical and
    ||, OR      Logical or
    !, NOT      Logical negation

Operators and expressions are for the most part borrowed directly from Perl, with a few exceptions, most of them conveniences. You may use == and != for both numbers and strings. Perl's subscript operators are replaced with a plain '.', as in colors.2 (the third element of the list colors) and color_value.yellow (where 'yellow' is a key in the hash color_value). The string concatenation operator is '_', borrowed from an early Perl 6 design (Perl 6 itself now uses '~').
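The '.' and '_' operators can be tried out with a short script. This sketch assumes the Template module is installed; the color_value hash and its contents are invented for the example, and the template is passed as a string reference instead of a file name so that the script is self-contained.

```perl
use strict;
use warnings;
use Template;

my $vars = {
    fruit       => 'apple',
    colors      => [ qw(red yellow green blue) ],
    color_value => { yellow => '#ffff00' },    # hypothetical data
};

# One directive per line: list subscript, hash subscript, concatenation.
my $text = <<'END_TT';
[% colors.2 %]
[% color_value.yellow %]
[% fruit _ 's' %]
END_TT

my $output   = '';
my $template = Template->new();

# A scalar reference as the first argument is treated as template text;
# a scalar reference as the third argument receives the output.
$template->process( \$text, $vars, \$output )
    || die "Template process failed: ", $template->error(), "\n";

print $output;
```

With the data above this should print 'green', '#ffff00' and 'apples', one per line.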

On Database Design

The whole point of having a database is to -model- your data.

If you try and turn it into a giant hash, then of course you're going to end up with nasty code.

The reality is that you should have actual columns for things and update your database as required as new types of data need to be included - you'll have to update the application anyway, so there isn't any reason not to update the database at the same time.

A general rule of thumb is that you should conceptualize your databases the same way you conceptualize your applications.

Your database schema (which tables you have, their columns and column data types, the relationships between tables, and so on) is like program code: it corresponds to how you decompose your application into libraries, classes, class attributes, type constraints, input constraints and so on. The actual data you put in your database tables is analogous to the data you put in your application's variables or objects.

Generally speaking it should be natural to change your actual database schema as often as you change your application source code, where it makes sense; for example, changing your schema is a similar sort of operation to changing what attributes your object classes have or your constraints.

Or, more accurately in practice, a database is more like (or in some cases, exactly like) a shared library: you write some classes once and share them between multiple applications, and if you change the library you have to consider the impact on all the applications that use it. Hence people tend to be more conservative about database design changes, but one still shouldn't be afraid to make them; all you really need is proper communication and planning between the involved parties so that the change goes smoothly.

Also, just as classes can expose multiple APIs (for example, keeping old ones for backwards compatibility when old applications can't be updated), databases have views (also called virtual tables), which let a database expose multiple APIs too; this is in fact one of the main purposes of views.
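Here is a small sketch of a view acting as a backwards-compatible API. It assumes the DBI and DBD::SQLite modules are installed and uses an in-memory database; the table, view and column names are made up. The idea is that the data now lives in a table called people with a full_name column, while an older application still expects a users relation with a name column.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database, just for the demonstration.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1 } );

# The current schema.
$dbh->do('CREATE TABLE people (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)');
$dbh->do( 'INSERT INTO people (full_name) VALUES (?)', undef, $_ )
    for 'Ada Lovelace', 'Alan Turing';

# A view that presents the old interface on top of the new table.
$dbh->do('CREATE VIEW users AS SELECT id, full_name AS name FROM people');

# Old applications keep using the old names; new ones use the table.
my $old_api = $dbh->selectcol_arrayref('SELECT name FROM users ORDER BY id');
my $new_api = $dbh->selectcol_arrayref('SELECT full_name FROM people ORDER BY id');

print "old API: ", join( ', ', @$old_api ), "\n";
print "new API: ", join( ', ', @$new_api ), "\n";
```

Both queries return the same rows, so the schema can evolve without breaking applications that cannot be updated at the same time.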

Links

The examples presented in the Catalyst docs will make more sense if you're familiar with the Perl modules they use. Excellent documentation exists for many of them. Here are links for some modules commonly used with Catalyst:
