The point is a web and Internet configuration that is portable across all Unix (and possibly even Windows) servers, composed entirely or almost entirely of open-source software, whose initial software costs can be zero.
This does not mean that setting up and maintaining the site will not cost a thing. It will. But there is evidence that such configurations are generally more robust, more configurable and less costly to maintain than the equivalent commercial ones.
I think the letters are arranged in order of increasing importance.
The OS itself is practically meaningless.
The web-server's function is usually very trivial.
But the database and especially the scripting language should not be proprietary.
2. What can you serve?
The content that the client sees in his browser will affect his surfing experience.
Every technology that can be sent to the client can be used properly or abused.
As a general rule, simple sites that don't try to push the web-clients to their limits work better.
It is usually a good idea to verify that your web-site operates fine even with a minimalistic client like lynx. If it does not, you should probably fix it so that it does.
XHTML 1.0 Transitional/Strict - corresponds to HTML 4.01, but based on XML.
XHTML 1.1 - more or less XHTML 1.0 Strict, with a few things done differently.
HTML 4.01 is better supported by older browsers, but XHTML can help detect a lot of bugs.
The Transitional variants may be recommended, because the strict ones disallow many features that are better supported by older or buggy browsers.
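As a rough illustration of how XHTML's XML basis helps catch bugs, the following Python sketch feeds a page to a standard XML parser and reports the first markup error. It is an assumption-laden simplification: it only checks that the page is well-formed XML, not that it is valid against the XHTML DTD, and pages that use named entities such as &nbsp; still need a real validator. The function name and sample pages are made up for the example.

```python
import xml.etree.ElementTree as ET


def find_markup_error(page_source):
    """Return None if the page is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(page_source)
    except ET.ParseError as error:
        return str(error)
    return None


# Hypothetical sample pages for the demonstration.
GOOD_PAGE = "<html><body><p>Hello<br /></p></body></html>"
BAD_PAGE = "<html><body><p>Hello<br></p></body></html>"  # <br> is never closed

print(find_markup_error(GOOD_PAGE))
print(find_markup_error(BAD_PAGE))
```

An HTML 4.01 parser would silently accept the second page, whereas the XML parser refuses it, which is the kind of bug detection meant above.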
2.2. CSS
Short for Cascading Style Sheets.
CSS enables modifying the visual behaviour of the HTML tags.
Allows sites that are friendlier to people with disabilities.
Facilitates having a more consistent look and feel within the site.
CSS Colours, Fonts, Etc.
A wise use of them can save you a lot of frustration.
Generally, changing the look and feel of the scroll-bars and other browser controls is a bad idea.
Specifying font sizes in pixels prevents users from resizing them in MSIE (an MSIE bug).
Don't pick colours that are hard to distinguish.
Try not to make links unrecognisable.
Use each tag for its logical purpose. (the site should also look good with CSS disabled.)
2.2.1. CSS Layout and Positioning
This is an example for a floaty.
The W3C recommends using CSS2-based layout instead of the traditional table-based layout.
The problem is that while some things are much simpler with CSS 2 (for instance, the floaty on this page), some simple table-based layouts are incredibly hard to reproduce (especially in a way that works on all browsers).
CSS positioning is (IMHO) complicated. I was not able to get the complete hang of it. You can play with it, or try to consult the CSS-d mailing list, until it seems right. (just don't forget to test on all browsers).
2.3. Images
As a general rule your site should not contain large images, or too many of them, unless they are explicitly linked to as such. (they are slow to download with slow connectivity)
Make sure you keep a copy of the original loss-less images and resize them and compress them as you see fit.
All images must have "alt" attributes (or else blind and other disabled people cannot understand them).
Using images as links when text can do, is considered a bad idea (since the text will load much faster).
If you want to have a navigation image, include some textual links below it.
Animated images are generally bad if they appear at primary pages of the site. Make sure your ads, if you have any, are non-animated.
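The "alt" rule above is easy to check mechanically. Here is a minimal sketch, using only Python's standard html.parser module, that lists the images in a page that lack an "alt" attribute. The class name, function name and sample markup are hypothetical; an empty alt="" is treated as present, since that is the usual way to mark purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also reach this method,
        # because the default handle_startendtag() delegates to it.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(page_source):
    """Return the src values of all images in the page that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(page_source)
    finder.close()
    return finder.missing


# Hypothetical sample markup.
SAMPLE = '<p><img src="logo.png" alt="Company logo"><img src="chart.png" /></p>'
print(images_without_alt(SAMPLE))
```

A script like this could be run over the whole site before each upload.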
2.4. JavaScript
JavaScript (now standardised as ECMAScript) can be used to script the contents of pages, change their appearance, or control their various on-place widgets.
The implementation of JavaScript varies from browser to browser and sometimes has many bugs. Use with care.
A general rule of thumb is that your site should not depend on JavaScript.
Do not use JavaScript for linking to another page, changing the appearance of a link on hover, generating HTML, or other things for which plain HTML would do.
Examples for good uses of JavaScript
Form Validation and Automation at the browser's side. (which does not preclude you from having sanity checks at the server).
Games. :-)
Navigation Menus. (again - the site should not depend on them).
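To illustrate the point about form validation: whatever the browser-side script checks, the server should repeat the sanity checks, because a client can have JavaScript disabled or send a hand-crafted request. The following is a minimal sketch of such a server-side check in Python. The field names, the length limit and the deliberately loose e-mail pattern are assumptions made for the example.

```python
import re

# Deliberately loose pattern (an assumption): something@something.tld
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MAX_MESSAGE_LENGTH = 5000  # hypothetical limit


def validate_contact_form(fields):
    """Return a dict mapping field names to error messages; empty means valid."""
    errors = {}
    name = fields.get("name", "").strip()
    email = fields.get("email", "").strip()
    message = fields.get("message", "").strip()
    if not name:
        errors["name"] = "Name is required."
    if not EMAIL_PATTERN.match(email):
        errors["email"] = "A valid e-mail address is required."
    if not message:
        errors["message"] = "Message is required."
    elif len(message) > MAX_MESSAGE_LENGTH:
        errors["message"] = "Message is too long."
    return errors


print(validate_contact_form({"name": "Dana", "email": "dana@example.com", "message": "Hi"}))
print(validate_contact_form({"email": "not-an-address"}))
```

The JavaScript version only saves the user a round trip; the server-side version is the one the site relies on.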
More about it
I prepared a discussion of the pros and cons of using JavaScript in pages one writes.
On the other hand, frames are usually completely unnecessary.
Putting a navigation bar as a table cell, or a div inside the HTML is usually more desirable than putting it in a separate frame.
Use HTML target to open other documents in new windows.
However, don't hide the navigation controls (unless you want to save space), or do pop-ups of any kind.
They are highly annoying and can be overridden by most browsers so you should not depend on them anyhow.
2.6. Java (Client-Side)
Java applets are useful to display an interactive GUI for the site.
However, they should not be a replacement for plain HTML, nor should they be loaded at the primary pages of the site.
Your site should be perfectly usable without them.
Good uses of Java:
Games. (as always)
Interactive Applications that will be very hard to implement in HTML and JavaScript.
2.7. Flash
Macromedia Flash is a popular technology for creating animated and interactive vector graphics presentations.
Your site should not depend on it. (some sites are a gigantic Flash page and that's not good.)
Flashy Flash ads are becoming increasingly annoying.
Good uses for Flash:
Interactive or non-interactive Presentations, movies, Flicks.
As a replacement for Java.
To provide an alternative convenient interface to a site. (while still having a pure-HTML one)
Availability
The Flash client is distributed as a binary-only driver with a possibly problematic license.
The W3C Standard "SVG" (short for Scalable Vector Graphics) aims to be a non-proprietary alternative for Flash.
As of Mid-2004 it is under-implemented in most platforms.
2.8. Dynamic HTML
Dynamic HTML (or DHTML) is an HTML page whose content is changed at run-time.
The advent of web-standards and modern web-browsers enables writing DHTML pages that are portable across many platforms.
Can be used to make the web-page feel more like a live application.
Can cause more potential problems than static HTML.
Generally, unneeded for most serious uses.
2.9. Media
MPEG films, MP3/Oggs, AU files
Make sure they are portable across platforms (i.e., not a Windows-only format).
Make sure users can still make use of your site if they do not wish to download them for some reason.
3. What's on the server?
Static Content - content that is read from the hard-disk and sent to the users in an unaltered form.
Server-side Generated Content - content produced by invoking a callback on the server side, which spits out content to the user (possibly based on user parameters like HTTP parameters, cookies, the URL path, etc.).
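The difference between the two kinds of content can be sketched with a tiny WSGI application in Python. This is only an illustration under stated assumptions: the paths /about.html and /hello, the "name" parameter and the fetch() helper are invented for the example, and a constant stands in for a file that a real web-server would read from disk.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

# Stand-in for a file on disk: it is sent to every user unaltered.
STATIC_PAGE = b"<html><body><h1>About us</h1></body></html>"


def application(environ, start_response):
    """WSGI callback: serves one static page and one generated page."""
    path = environ.get("PATH_INFO", "/")
    if path == "/about.html":
        body = STATIC_PAGE  # static content
    elif path == "/hello":
        # Generated content: the output depends on the HTTP parameters.
        query = parse_qs(environ.get("QUERY_STRING", ""))
        name = query.get("name", ["world"])[0]
        page = "<html><body><p>Hello, %s!</p></body></html>" % escape(name)
        body = page.encode("utf-8")
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def fetch(path, query=""):
    """Hypothetical helper: call the application directly, without a network."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["QUERY_STRING"] = query
    statuses = []
    chunks = application(environ, lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(chunks)


print(fetch("/about.html"))
print(fetch("/hello", "name=Dana"))
```

Note that the generated page has to escape the user-supplied name, a concern that simply does not exist for the static page.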
3.1. How much Generated Content do you Need?
How much of your site should be scriptable?
Company sites with various products, whose content changes very rarely, and with at most a "Contact Us" handler, are better off left as static HTML.
A news site, search engine, online store, etc. require a site that should be generated by a suitable back-end.
It is tempting to make an entire site auto-generated just to have a common look and feel.
This wastes CPU, makes the site vulnerable to security breaches, etc.
Common Look and feel, a navigation bar, etc. can be achieved with pre-generated content as well.
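A minimal sketch of the pre-generation approach, in Python: a build script runs once on the developer's machine, wraps each page's body in a shared template with a navigation bar, and the resulting plain HTML files are what the server hands out. The page names, the template and the site title are all hypothetical.

```python
from string import Template

# Shared look and feel; every page is wrapped in this template at build time.
PAGE_TEMPLATE = Template("""<html>
<head><title>$title - Example Ltd.</title></head>
<body>
<div class="navbar">$navbar</div>
<h1>$title</h1>
$body
</body>
</html>
""")

# Hypothetical site content: filename -> (title, body).
PAGES = {
    "index.html": ("Home", "<p>Welcome.</p>"),
    "products.html": ("Products", "<p>Our products.</p>"),
}


def render_navbar(current):
    """Link to every page except the current one, which is shown in bold."""
    links = []
    for filename, (title, _body) in PAGES.items():
        if filename == current:
            links.append("<b>%s</b>" % title)
        else:
            links.append('<a href="%s">%s</a>' % (filename, title))
    return " | ".join(links)


def render_site():
    """Return {filename: html}; a build script would write these to disk once."""
    return {
        filename: PAGE_TEMPLATE.substitute(
            title=title, body=body, navbar=render_navbar(filename)
        )
        for filename, (title, body) in PAGES.items()
    }


for filename, html in render_site().items():
    print(filename, len(html))
```

The server then spends no CPU on templating and runs no code per request, yet all pages share the same look.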
3.2. Overview of Common Server-Side Scripting Technologies
The core Perl language is very extensive, and makes use of many programming elements, and allows for many different programming styles.
Beginning programmers may have trouble understanding the code of more experienced programmers, or write programs in what can be considered a sub-optimal manner.
This may or may not be an advantage.
3.2.2. PHP
PHP is a cross-platform scripting language intended specifically for the Web.
PHP was thought to be faster than Perl, but it is less powerful and flexible (no modules, no namespaces, objects proper only in PHP 5, no closures, limited anonymous functions capability), and sports an easy to learn core language.
(A recent study by Yahoo, however, claimed mod_perl had better performance than mod_php.)
PHP, as a language, has several potential security issues, which cause sloppy code to become vulnerable more easily.
Secure code can still be written, but it takes more discipline.
The PHP back-end itself has had a relatively poor security record, with many vulnerabilities discovered in it.
PHP has many configuration options, which affect its run-time behaviour.
This makes writing code that can be deployed on different systems with different configurations more difficult.
3.2.3. Python
Python is an interpreted language, that competes for the same niche as Perl and has similar capabilities.
However, it has a completely different philosophy with different syntax and conventions.
It has a "There's one way to do it" philosophy which enables people to understand and modify each other's code. (In contrast to Perl's "There's more than one way to do it.")
Uses whitespace for indentation, which makes embedding it inside text templates more difficult.
Somewhat slower than Perl in most cases, but still quite fast enough for most needs.
Ruby is a language that is supposed to combine the advantages of Perl, Python and Smalltalk. Much less widely used than Perl and Python (and PHP) at the moment, but looks promising.
Considerably slower than Perl.
3.2.4.2. OCaml
OCaml is a strongly-typed, object-oriented, functional language.
Compiles into native code, and so is very fast. (sometimes even more than gcc)
3.2.4.3. Tcl
Tcl is a language that used to compete for the same niche as Perl, with the advantage of having a GUI toolkit (Tk).
Has some serious limitations. (C-like NULL-terminated strings, strings as closures).
Not commonly used for web-scripting, except mostly as part of the OpenACS framework.
3.2.4.4. Java Server Pages (JSP)
Java Server Pages (or JSP) - server-side scripts that make use of Java Objects (Java Beans) to perform most of the complex operations.
Each .jsp script is translated to Java on the fly, and cached.
Being Java it is reportedly hard to set up in Apache and other web-servers.
Java has a non-open-source and problematic license.
3.2.4.5. Java Servlets
Java Servlets are Java objects that are compiled and stored persistently on the server side.
Quite problematic to write and maintain due to compulsory compilation stage and the inherent verbosity of Java, and its many required protocols.
3.2.4.6. CGI, FastCGI C/C++ Scripts
Write programs in C or C++ to do the job for you.
Faster than most interpreted solutions (in cases where calculations performed by the interpreter are slow).
However, it takes more time to compile, is more error-prone, and is more verbose.
3.2.4.7. Web-server Extensions
Web-server extensions (such as Apache modules) are compiled native-code bindings, that are written in C/C++.
The fastest solution, but potentially insecure because it runs in the same process-space as the server.
3.2.4.8. Writing your own Web-Server
You may opt to write your own dedicated web-server for serving the requests.
Not recommended, because the standard may change, and you'll have to modify the code.
A possibility if the task at hand is simple enough, and/or you need the extra speed.
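To give a feel for what "simple enough" means, here is a sketch of a dedicated server in Python that answers a fixed set of GET requests and nothing else. It is not a complete HTTP implementation: the ANSWERS table, the port number and the assumption that a whole request arrives in a single recv() call are all simplifications made for the example.

```python
import socket

# Hypothetical fixed set of resources this dedicated server knows about.
ANSWERS = {"/ping": b"pong\n"}


def handle_request(raw_request):
    """Build a complete HTTP/1.0 response for one raw request."""
    try:
        request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
        method, path, _version = request_line.split(" ")
    except ValueError:  # also covers undecodable bytes
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
    body = ANSWERS.get(path)
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    header = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        "Content-Length: %d\r\n\r\n" % len(body)
    )
    return header.encode("ascii") + body


def serve(port=8080):
    """Accept connections one at a time; assumes each request fits in one recv()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen(5)
        while True:
            connection, _address = listener.accept()
            with connection:
                connection.sendall(handle_request(connection.recv(4096)))


print(handle_request(b"GET /ping HTTP/1.0\r\n\r\n"))
```

Everything the sketch leaves out (headers, keep-alive, partial reads, newer protocol versions) is exactly the code you would have to write and keep up to date yourself.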
4. Content Management Systems
There are two types of frameworks called Content-Management Systems (or CMSes for short):
Those that are installed on the server (as a dynamic extension), are configured somehow and are used to manage and customize the site.
Those that are operated on the developer's local machine and generate a final static or partially dynamic site, which is then uploaded as is to the server.
4.1. Examples of Server-Installed CMSes
Slash - Based on Perl, and various Perl technologies. GPLed.
Squishdot for Zope - similar to Slash only based on Zope.
PostNuke - an extensible web-log software written in PHP.
Working with such databases is more tedious than working with SQL databases. But they offer somewhat greater control.
5.3. Compatibility between Databases
The variants of SQL used by different database implementations are very different.
The SQL standards define a very low common denominator, and even most SQL functions have different names and semantics in different implementations.
If you wish to support more than one implementation, you'll probably need more than one codebase.
It is a good idea to think what features you would need (at present and in the future) and choose one database that supports all of them.
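A small illustration of how quickly the dialects diverge: even "give me the first N rows of a table" is spelled differently in common databases. The Python sketch below builds that one query for several dialects. The function name and dialect labels are made up for the example, and the table name is interpolated directly, so it is assumed to come from trusted code rather than from user input.

```python
def first_rows_query(dialect, table, count):
    """Return SQL selecting the first `count` rows of `table` for one dialect."""
    if dialect in ("mysql", "postgresql", "sqlite"):
        return "SELECT * FROM %s LIMIT %d" % (table, count)
    if dialect == "mssql":
        return "SELECT TOP %d * FROM %s" % (count, table)
    if dialect == "oracle":
        # Classic Oracle idiom based on the ROWNUM pseudo-column.
        return "SELECT * FROM %s WHERE ROWNUM <= %d" % (table, count)
    raise ValueError("unsupported dialect: %s" % dialect)


for name in ("mysql", "mssql", "oracle"):
    print(first_rows_query(name, "products", 10))
```

Multiply this by every non-trivial query in the application and the "more than one codebase" warning becomes concrete.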
5.4. Do you need a Database?
Paul Graham wrote in his ViaWeb FAQ, that he did not use a database for ViaWeb (now Yahoo Store):
We didn't use one. We just stored everything in files. The Unix file system is pretty good at not losing your data, especially if you put the files on a Netapp.
It is a common mistake to think of Web-based apps as interfaces to databases. Desktop apps aren't just interfaces to databases; why should Web-based apps be any different? The hard part is not where you store the data, but what the software does.
I disagree with this statement.
Desktop apps don't always use databases, because they can afford to load the data on start-up, store it internally in memory using efficient data-structures, and then save the data to disk in serialized form at an Exit or a Save command.
Web-based programs, on the other hand, cannot afford to load and store the serialized data during every web request, and cannot maintain a state in memory because the HTTP protocol is stateless.
A database system allows them to manipulate efficient and complex data structures without having to constantly serialize and de-serialize them.
It is also probable that when working against the raw file-system, one will implement a lot of database functionality using the file-system primitives, which may be a lot of extra work.
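The argument above can be sketched in Python. Each simulated web request opens the database, asks for exactly one row, and closes it again, without loading or re-serializing the rest of the data. SQLite is used here only because it ships with Python (a LAMP site would more likely talk to MySQL or PostgreSQL), and the table layout, function names and sample rows are assumptions made for the example.

```python
import os
import sqlite3
import tempfile


def create_store(path):
    """One-time setup: create the table and insert some sample products."""
    connection = sqlite3.connect(path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "CREATE TABLE IF NOT EXISTS products "
                "(id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER)"
            )
            connection.executemany(
                "INSERT OR REPLACE INTO products (id, name, price_cents) "
                "VALUES (?, ?, ?)",
                [(1, "Mug", 900), (2, "T-shirt", 1500)],
            )
    finally:
        connection.close()


def handle_product_request(path, product_id):
    """What one stateless web request does: fetch a single row by its key."""
    connection = sqlite3.connect(path)
    try:
        return connection.execute(
            "SELECT name, price_cents FROM products WHERE id = ?", (product_id,)
        ).fetchone()
    finally:
        connection.close()


demo_path = os.path.join(tempfile.mkdtemp(), "shop.db")
create_store(demo_path)
print(handle_product_request(demo_path, 2))
```

Doing the same against a single serialized file would mean reading and parsing the whole file on every request, or re-inventing the primary-key lookup on top of the file-system.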
Incomplete or completely lacking support for newer standards.
As a result, it tends to break under many pages.
Very small (and decreasing) use percentage, but can still be found in god-forsaken (mainly old) installations.
My advice: ignore completely. Supporting it is not worth the trouble. Your kilometrage may vary, however. (depending on the circumstances)
6.2. Internet Explorer 5.0, 5.5, 6.0
Available only for Windows (except for some really ancient Mac OS and Solaris/HP-UX versions)
Full of security bugs. (which concern its users, not the web-site developers)
Full of rendering bugs.
Still carries many proprietary extensions that are incompatible with the standards.
Not fully standards compliant. (many details of the standard are missing)
Lags behind other browsers in usability improvements.
"The new Netscape Navigator 4".
The most commonly used browser to date. (especially in Israel)
Can no longer be upgraded except as part of the OS, so its continued user-base will remain a problem, unless it diminishes.
Verdict
If you are running a site which appeals to a large (mainly not tech-savvy) user-base, you still need to support it.
By keeping your site clean of unnecessary problematic embellishments (which is a good idea anyway you look at it), you can best make sure it is compatible with this and other browsers.
You can look at my "Stop Using Internet Explorer" page for an alternative opinion about why I don't guarantee support for MSIE on my non-commercial sites. It has some useful links that explain its problems.
6.3. Mozilla
Cross-Platform (Windows 32, All UNIXes, Mac OS (Classic and X))
The most complete browser to date in its support for standards. (some things are still missing, but work is being carried out in this direction)
Has many registered bugs, but they are relatively rarely encountered. (better than most other browsers, at least)
Still relatively under-used in comparison to MSIE. (at least on Windows, which is the most common platform)
Its developers made a conscious decision not to support the MSIE deviations from and extensions to the W3C standards, and so it often breaks on sites that only work in MSIE.
Verdict
Make sure your site supports this browser.
It is available on your development platform of choice, and many people use it.
6.4. KHTML/Safari
KHTML is the HTML-rendering engine developed by the KDE development team.
Based on the Qt toolkit.
Open source, under LGPL.
Apple used it as a basis for their Safari browser, which is the default browser for Mac OS X.
Market share: a small percent of the UNIX workstations, and a relatively large percent of people using Mac OS X.
Made a policy of trying to support MSIE extensions to the standard (so MSIE-only sites will be compatible), as long as they don't contradict the W3C standards. (still many sites break.)
Historically, many bugs were present, but the situation will hopefully improve in the future.
Verdict
If your client workstation is Windows, you may have problems testing it. (KDE may or may not be ported to cygwin and a Win32-based X-Windows server).
The problem is that many pages can break in it because of its many bugs.
I recommend testing in it, but it's not absolutely necessary.
6.5. Opera
A cross-platform Qt-based browser.
Aims to be very fast and lightweight.
Commercial. Distributed as binaries for most common platforms. (Ad-ware or pay)
Support for some W3C standards (or parts of them) is not present. They may or may not be added in the future.
Very small marketshare. (sort of falls between the chairs of OS-shipped browsers and the open-source Mozilla).
Verdict
Test with it, if you can (and you probably can). Other than that, it's not a very important browser. (at least not to date)
7.1. Why it is Important to Support non-MSIE Browsers
The "minority does not count" mantra that may persuade you to discriminate against non-Microsoft Internet Explorer users is not healthy.
You reject an entire (and growing) population who knows better than to choose IE.
You are also aware that your site is dysfunctional and non-standards-compliant.
This is bad in itself and may indicate that it will not keep functioning in the future. (you don't expect that everybody will use MSIE 6.0 forever).
7.2. Why it is Important to Keep Your Site Standards Compliant
By validating your pages you make sure the structure of the document is acceptable to standards-compliant browsers.
When you deviate from the standard, you don't know how the browser is going to react.
Saying that "I don't know if my site is standards compliant or not, but it looks good in all browsers." (or just in one browser and one platform on which it was tested) is not a good idea.
The browsers may not display this page correctly in future versions, or a new browser may do things differently.
(For instance, when MSIE 5.0 came out, it was more standards-compliant than MSIE 4.x and as a result many sites that worked with IE before became broken. This happened again with the release of MSIE 6.0, and with Windows XP Service Pack 2.)
The browser has a tough job as it is, so you shouldn't make it tougher by sending it mal-formed input.
Caveats
Standards-compliance is not a panacea.
Even if all the output from the site is standards-compliant it doesn't mean it will display correctly.
This is either because of bugs in the browsers, or because you haven't done the right thing.
So, it is important to test the pages in as many browsers as possible, in addition to validating them.
7.3. Why it is Important to Keep Your Site Clean of Unnecessary Embellishments
One can write standards-compliant sites that are full of JavaScript games, Dynamic HTML, and other monsters like that. It is usually a bad idea.
Getting JavaScript code to work properly on all browsers is more difficult than getting a static HTML to do so. (or at least it can never be easier).
Most JavaScript code is unnecessary. It adds more gizmos to the site, but not more functionality.
By adding unnecessary embellishments to the site, you make it more prone to browser bugs and mis-features; you make the site harder (and more costly) to maintain; and you usually don't add enough to the user experience to be worth it.
An extreme example of this are pages that use HTML 4.x markup without any CSS or other visual embellishments. If you do this, I guarantee you that your portability problems are over.
( I wouldn't recommend this extreme, because it will make your pages quite boring, but it still illustrates a point. )
7.4. Some Words of Wisdom
"I regret being so blunt but, no, we are not going to change our website in the near future. We have developed an amazing website, that costs us a lot of money to develop and maintain, and it is currently built exactly according to our business needs.
I can also say that although technologically it might be possible to develop a version for Mozilla or other browsers, Orange [Israel], and many other content providers, are investing in the most common browser, it is currently not cost efficient to develop versions for other browsers."
The GPL stands for the "GNU General Public License", and is a commonly used open-source license.
There are other open-source licenses that are similar to it.
You may have heard about its "viral" nature, that forces code that uses GPLed code to be open-source as well. Should you worry about it?
Usually there's nothing to fear. The GPL (and all other free software licenses) explicitly allows using (and even modifying) the software for internal use.
Software that operates web-sites is considered software for internal use, as it is not distributed to the outside.
Amazon.com as an Example
If we take Amazon.com for example, then they may make use of GPLed code for their web-site.
If, however, they decide to distribute the code as a framework that allows setting up similar sites (say Amazonware), then they'll have to either comply with the terms of the GPL and distribute it under a compatible free software license, or alternatively eliminate the use of the GPLed software.
(Or if the option permits, get the copyright owner of the GPLed software to exempt them from the GPL somehow.)
Conclusion
This presentation was directed primarily at people who manage or wish to set up web-sites.
If you want to sell a framework that will facilitate setting up web-sites using LAMP, I wish you the best of luck, but you'll have to handle on your own the legal problems involved in making use of the available open-source (and proprietary) software.
Generally note that the distribution terms of proprietary software may give you even more trouble, as far as basing a web-site on them is concerned.