Friday 20th of January 2017 08:55:52 PM

Module Rewrite - URL Rewriting Guide



    Module Rewrite

    Welcome to mod_rewrite, voodoo of URL manipulation.

    This document describes how to use Apache's mod_rewrite to solve typical URL-based problems that webmasters face in practice. mod_rewrite is an Apache module that provides a powerful way to manipulate URLs. With it you can do nearly every kind of URL manipulation you have ever dreamed about. The price is complexity: mod_rewrite is not easy for a beginner to understand and use.

    NOTE: Depending on your server configuration, you may need to adapt the examples to your situation. Always try to understand what an example really does before you use it. A badly written rule can cause an endless rewriting loop that hangs the server.

    Most examples can be used in a .htaccess file, while others work only in the Apache httpd.conf file.


    RewriteCond

    The RewriteCond directive defines a rule condition. Precede a RewriteRule with one or more RewriteCond directives. The rewriting rule that follows is then used only if its pattern matches the current state of the URI and these additional conditions apply too.

    You can set special flags for the condition pattern by appending a third argument to the RewriteCond directive. The flags are a comma-separated list of the following:

    [NC] (No Case)
    This makes the condition pattern case-insensitive: there is no difference between 'A-Z' and 'a-z'.

    [OR] (OR next condition)
    Combines this rule condition with the next one using a logical OR instead of the implicit AND.
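
    As a minimal sketch of both flags, the conditions below select Lynx or very old Mozilla browsers. The User-Agent patterns and the target page index.20.html are taken from the user agent example in this guide. Adding [NC] to the Lynx condition is an assumption made here for illustration only.

    RewriteEngine On
    # True for "Lynx/", "lynx/", "LYNX/" ... because of [NC];
    # [OR] means the next condition is an alternative, not an extra requirement
    RewriteCond %{HTTP_USER_AGENT} ^Lynx/ [NC,OR]
    # Last condition of the OR chain, so it carries no [OR] flag
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/[12]
    RewriteRule ^index\.html$ /index.20.html [L]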


    RewriteRule

    The RewriteRule directive does the actual rewriting.

    You can set special flags for the substitution by appending a third argument to the RewriteRule directive. The flags are a comma-separated list of the following:

    [R] (force Redirect)
    Redirects the browser to the new URL with an external redirect. Sends the HTTP response 302 (MOVED TEMPORARILY).

    [F] (force URL to be Forbidden)
    Forces the current URL to be forbidden. Sends the HTTP response 403 (FORBIDDEN).

    [G] (force URL to be Gone)
    Forces the current URL to be gone. Sends the HTTP response 410 (GONE).

    [L] (last rule)
    Stops the rewriting process here and does not apply any more rewriting rules.

    [P] (force proxy)
    Forces the current URL to be handled as a proxy request and passed through the proxy module mod_proxy.
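
    The following sketch shows how some of these flags might be used. It assumes the rules live in a .htaccess file in the document root, so the patterns have no leading slash. The names private/, old-page.html, news.html and the target URL are hypothetical and only serve as illustration.

    RewriteEngine On
    # [F]: answer 403 for everything below a private directory;
    # "-" as substitution means the URL itself is left unchanged
    RewriteRule ^private/ - [F]
    # [G]: answer 410 for a page that has been removed for good
    RewriteRule ^old-page\.html$ - [G]
    # [R,L]: send an external redirect (302) and stop processing further rules
    RewriteRule ^news\.html$ http://www.domain.com/news/ [R,L]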


    Regular expressions

    Some hints about the syntax of regular expressions:

    Text:
    . Any single character
    [chars] One  of chars
    [^chars] None of chars
    text1|text2 text1 or text2
    
    Quantifiers:
    ? 0 or 1 of the preceding text
    * 0 or N of the preceding text (N > 0)
    + 1 or N of the preceding text (N > 1)
    
    Grouping:
    (text) Grouping of text
    
    Anchors:
    ^ Start of line anchor
    $ End of line anchor
    
    Escaping:
    \ char escape that particular char
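
    To see these elements work together, here is a small sketch of a rule pattern. The file names (photo12.jpeg and so on) and the target directory /images/ are hypothetical, and the rule is assumed to live in a .htaccess file in the document root.

    RewriteEngine On
    # ^ and $ anchor the pattern to the whole path
    # ([0-9]+)     one or more digits, captured as $1
    # \.           an escaped, literal dot
    # (gif|jpe?g)  gif, jpg or jpeg, captured as $2
    RewriteRule ^photo([0-9]+)\.(gif|jpe?g)$ /images/$1.$2 [L]

    With this rule a request for photo12.jpeg would be rewritten to /images/12.jpeg, while photo.jpeg (no digits) and photo12.png (other extension) would not match.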
    

    Condition pattern

    There are some special variants of CondPatterns. Instead of real regular expression strings you can also use one of the following. Here String is the test string, that is, the first argument of RewriteCond, and Condition is the text that follows the operator.

    < Condition (is lower than Condition)
    Treats the Condition as a plain string and compares it lexically to String. True if String is lexically lower than Condition.

    > Condition (is greater than Condition)
    Treats the Condition as a plain string and compares it lexically to String. True if String is lexically greater than Condition.

    = Condition (is equal to Condition)
    Treats the Condition as a plain string and compares it to String. True if String is equal to Condition, character by character.

    -d (is directory)
    Treats the String as a pathname and tests if it exists and is a directory.

    -f (is regular file)
    Treats the String as a pathname and tests if it exists and is a regular file.

    -s (is regular file with size)
    Treats the String as a pathname and tests if it exists and is a regular file with size greater than zero.

    -l (is symbolic link)
    Treats the String as a pathname and tests if it exists and is a symbolic link.

    -F (is existing file via sub request)
    Checks if String is a valid file and accessible via all the server's currently configured access controls for that path. Use it with care, because the internal sub request decreases your server's performance!

    -U (is existing URL via sub request)
    Checks if String is a valid URL and accessible via all the server's currently configured access controls for that path. Use it with care, because the internal sub request decreases your server's performance!

    NOTE: You can prefix the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.
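
    As a brief sketch, the file tests can be combined with the '!' prefix so that a rule applies only to requests that do not map to anything on disk. The page /notfound.html is a hypothetical name and is assumed to exist, since otherwise the rewritten request would match the conditions again.

    RewriteEngine On
    # True if the requested path is NOT an existing regular file ...
    RewriteCond %{REQUEST_FILENAME} !-f
    # ... and is NOT an existing directory either
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ /notfound.html [L]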


    Protecting your images and files from linking

    DESCRIPTION: Sometimes other webmasters link directly to your download files, or embed images hosted on your server as inline images on their own pages.

    RewriteEngine On
    RewriteCond %{HTTP_REFERER} !^$ [NC]
    RewriteCond %{HTTP_REFERER} !^http://domain\.com [NC]
    RewriteCond %{HTTP_REFERER} !^http://www\.domain\.com [NC]
    RewriteCond %{HTTP_REFERER} !^http://212\.204\.218\.80 [NC]
    RewriteRule ^.*$ http://www.domain.com/ [R,L]
    

    EXPLAIN: In this case visitors are redirected to http://www.domain.com/ if the request carries a Referer header that does not start with http://domain.com, http://www.domain.com or http://212.204.218.80. Requests with an empty Referer header are not redirected.


    Redirect visitor by domain name

    DESCRIPTION: In some cases the same web site is reachable under several addresses, such as domain.com, www.domain.com and www.domain2.com, and we want to redirect all of them to a single address.

    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
    RewriteRule ^(.*)$ http://www.domain.com/$1 [R,L]
    

    EXPLAIN: In this case the requested URL http://domain.com/foo.html would be redirected to the URL http://www.domain.com/foo.html.


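    Redirect domains to other directory

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^www\.domain\.com$
    RewriteCond %{REQUEST_URI} !^/HTML2/
    RewriteRule ^(.*)$ /HTML2/$1

    EXPLAIN: In this case a request for http://www.domain.com/foo.html is internally rewritten to /HTML2/foo.html. The second condition skips requests that already point into /HTML2/, which prevents a rewriting loop.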

    Redirect visitor by user agent

    DESCRIPTION: For important top-level pages it is sometimes necessary to provide pages that depend on the browser. One has to provide a version for the latest Netscape, a version for the latest Internet Explorer, a version for Lynx or old browsers, and an average-feature version for all others.

    # MS Internet Explorer - Mozilla v4
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4(.*)MSIE
    RewriteRule ^index\.html$ /index.IE.html [L]
    # Netscape v6.+ - Mozilla v5
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Gecko
    RewriteRule ^index\.html$ /index.NS5.html [L]
    # Lynx or Mozilla v1/2
    RewriteCond %{HTTP_USER_AGENT} ^Lynx/ [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla/[12]
    RewriteRule ^index\.html$ /index.20.html [L]
    # All other browsers
    RewriteRule ^index\.html$ /index.32.html [L]
    

    EXPLAIN: In this case we have to act on the HTTP header User-Agent. If the User-Agent begins with Mozilla/4 and contains MSIE (MS Internet Explorer), the page index.html is rewritten to index.IE.html and the rewriting stops. If the User-Agent begins with Mozilla/5 and contains Gecko (Netscape), the page index.html is rewritten to index.NS5.html. If the User-Agent begins with Lynx/, Mozilla/1 or Mozilla/2, the page index.html is rewritten to index.20.html. All other browsers receive the page index.32.html.


    































