Showing posts with label apache. Show all posts

Friday, April 11, 2014

Using RewriteRule instead of Alias in .htaccess

Subtitle: Why aren't Apache rewrites working?!?!?!?!

I had a few directories (each one a mini website or webapp) sharing a fonts directory, and I was doing this with symlinks. But symlinks are a pain to rsync, so I thought: let's get Apache to do the symlinking, i.e. use Alias. I put this in .htaccess:

  Alias /mysite/fonts/ ../shared/fonts/

500 Internal Server Error.

Or, in other words, Alias cannot be used in .htaccess, only in httpd.conf. I could've done it that way (added the Alias to httpd.conf) but as I was already using RewriteRule in the .htaccess file I decided to persevere finding a way to do it using just the .htaccess file.

This was my first attempt:

   RewriteRule ^/mysite/fonts/(.*)$ /shared/fonts/$1  [L]

Completely ignored. Now, when you start googling, or searching on StackOverflow, for "Apache rewrite does not work", 99% of the hits will tell you how to turn it on. I.e. you need AllowOverride All in your httpd.conf, which allows you to use .htaccess files, and you need Options FollowSymLinks so that the rewrite engine will work. And you need RewriteEngine On in your .htaccess file.


Unfortunately I'd done all those things. I knew the rewrite engine was working, as I had other rules in this same .htaccess file, and commenting them out changed behaviour. My problem was that my new rewrite rule was just being ignored, even though it looked correct.


I'll spare you the pain of seeing the 187 other experiments I tried. The forehead-slap moment came when I realized the first parameter of the rewrite rule is relative to the directory the .htaccess file is in. I.e. my .htaccess file is in the /mysite/ directory. And, thus, the correct rule turned out to be:

   RewriteRule ^fonts/(.*)$ /shared/fonts/$1  [L]

Note the lack of leading forward slash. Very important. Incidentally, the second parameter is relative to the server root if you start it with a forward slash! In other words, it is only the first parameter that doesn't work that way.
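
To make the matching behaviour concrete, here is a rough Python model of what happens in per-directory (.htaccess) context: the directory's own URL prefix is stripped before the pattern is tested. This is only a sketch. The helper name `rewrite_in_htaccess` and its parameters are made up for illustration, the font filename is hypothetical, and the real mod_rewrite does considerably more than this.

```python
import re

def rewrite_in_htaccess(url_path, htaccess_dir, pattern, substitution):
    """Toy model of a RewriteRule in a .htaccess file (hypothetical helper).

    Returns the rewritten path, or None if the rule does not apply.
    """
    prefix = htaccess_dir.rstrip("/") + "/"
    if not url_path.startswith(prefix):
        return None  # the request is outside this directory
    # Per-directory context: the pattern sees the path *without* the prefix,
    # so what it is matched against never starts with a slash.
    relative = url_path[len(prefix):]
    match = re.search(pattern, relative)
    if match is None:
        return None
    # Translate Apache's $1 back-reference into Python's \1 form.
    return match.expand(substitution.replace("$1", r"\1"))

request = "/mysite/fonts/open-sans.woff"

# First attempt: the pattern expects the full URL path, so it never matches.
print(rewrite_in_htaccess(request, "/mysite", r"^/mysite/fonts/(.*)$", "/shared/fonts/$1"))

# Corrected rule: the pattern is relative to the directory holding .htaccess.
print(rewrite_in_htaccess(request, "/mysite", r"^fonts/(.*)$", "/shared/fonts/$1"))
```

The first call prints None (the rule is silently skipped, just like my first attempt) and the second prints /shared/fonts/open-sans.woff.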

The [L] at the end of the RewriteRule means "last", and is normally what you want for this kind of rewrite rule. Use [R] ("redirect") instead if you want the browser to see the new URL (i.e. an HTTP redirect is sent back). Useful troubleshooting tip: if you set the flag to [F] then you can immediately see whether the first parameter of the rule is being recognized: if it matches, you get a 403 (and if you don't, you have to fix the first parameter). If you get a 403 with [F], but don't get the expected results when you change the [F] back to [L], then it is the second parameter you need to fix.

So, to summarize, if your .htaccess file is in a subdirectory, remember that the RewriteRule has to be relative to that subdirectory.



Sunday, November 6, 2011

PHP, Proxies, HTTPS: v2, v3 or v23?!

From a PHP http client, using HTTPS via a proxy, I started getting a "400 bad request" error from Apache. I knew it could work because it worked last week. The apache error log message was:
   Hostname 127.0.0.1 provided via SNI and hostname mytest.local provided via HTTP are different

My first troubleshooting mistake was messing around with server-side settings: I found out what SNI meant, but as far as I could see I wasn't using it. Then, finally, I remembered I could use curl as a test http client, and it was working fine. I removed all my server-side changes and curl was still connecting fine. So now I knew I'd broken something client-side. I added the -v flag to curl to see the exact headers it was sending: they were the same headers my PHP client was sending.

Finally I remembered I'd changed this line:
  stream_socket_enable_crypto($fp,true,STREAM_CRYPTO_METHOD_SSLv23_CLIENT);

The PHP docs give no guidance on which option to choose, but "v23" sounded like it would work with version 2 or version 3, and maybe do all kinds of auto-negotiation behind the scenes. Which had to be a good thing. I'm sure I'd tested after changing, but I must have tested non-HTTPS or without the proxy by mistake. When I changed back to this line, everything worked again:
  stream_socket_enable_crypto($fp,true,STREAM_CRYPTO_METHOD_SSLv3_CLIENT);

I hope that helps someone, as google was no use for me (all the hits about the SNI name and HTTP name difference were due to Apache being case-sensitive about the name comparison, which was not the problem here).

By the way if you want to know how to do http connections, with a proxy, using PHP, supporting both HTTP and HTTPS, I've described it here.

Saturday, September 17, 2011

Add a temporary static IP address

At home, with wired ethernet, my (Ubuntu) notebook has a few static IP addresses that I use for developing websites. Out of the house, I use wicd, so I have a dynamic IP address, and those static IPs don't exist. wicd configuration is too complex for me to understand, so I just accept this, but it caught me short the other day when I needed to have both an internet connection and to be able to work on a website running on my notebook.

I failed then, but I'm ready for next time. To temporarily add a static IP address you simply do (as root):
ifconfig eth0:3 10.10.10.10 netmask 255.255.0.0
I'm choosing "eth0:3" for the interface; it can be any unused number after the colon, and you never need to care what this is. netmask can really be anything for our purposes. The 10.10.10.10 is the IP address I've given it. Test with this:
ping 10.10.10.10
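
If you are curious what that netmask actually implies, Python's standard `ipaddress` module can work it out. This is just a side check using the address and netmask from the ifconfig line above, not part of the setup itself.

```python
import ipaddress

# The address and netmask passed to ifconfig above.
iface = ipaddress.ip_interface("10.10.10.10/255.255.0.0")

print(iface.network)        # the subnet the alias claims to be on
print(iface.ip.is_private)  # whether the address is in a private range
```

This prints 10.10.0.0/16 and True, i.e. the netmask puts the alias on a /16 inside the private 10.0.0.0/8 range.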
To set up a quick virtual host create a file under /etc/apache2/conf.d called 10.10.10.10.conf (any filename is fine) with these contents:
<VirtualHost 10.10.10.10:80>
    DocumentRoot "/var/www/somewhere"
    ServerName 10.10.10.10
</VirtualHost>
 
Tidyup
To remove just the interface that you added above, use this command:
ip addr del 10.10.10.10/32 dev eth0:3
Or, to restore the network to boot defaults (useful if you have done lots of changes) you can do:
ifdown -a
ifup -a
Either way, to then remove the Apache config, delete the 10.10.10.10.conf file you created and restart Apache.

Monday, September 12, 2011

node.js: Good Tutorial

Chatting with a friend at the weekend, node.js came up; I'd only vaguely heard of it, but apparently it is what all the kewl kids are into. Server-side javascript, right up my street.

I took a very quick look yesterday, and it sounded interesting: especially for speed-critical websites. So, today I took a deeper look. First up you should know the official website has a documentation link that only goes to the API documentation. In real terms that means node.js is officially undocumented. Then from the Wiki I found a link to a free e-book called Mastering Node. I won't give the link as it is (currently) over-priced: poorly organized and unfinished.

I was getting a bit demoralized when a search found http://www.nodebeginner.org/. Now this is more like it. In fact this is an excellent tutorial that goes right from raw beginner to a reasonably complex mini-application. (I read it in HTML format, but as an ebook it is 60 pages, so you can get an idea of how involved it is going to get.)

I followed along, and it was fairly easy, though I did have a lot of trouble outputting the POST-ed data. It always said "undefined" in my browser. I could do a console.log() of the variable just before and just after writing it to the browser and it was set correctly. I cleared the cache repeatedly, and tried a second browser. Annoyingly I didn't solve that problem in the end.

The main point of this blog post is to recommend the above tutorial/ebook if you're interested in getting a feel for node.js, but What Do I Think About node.js?

Hhhmmm. First and foremost, it should only be used by expert programmers. It is a bit like C++ in that it is going to be easy to shoot yourself in the foot. Asynchronous programming is hard. But if you use a synchronous programming-style you will lock up the whole server. I'm thinking in terms of the web server example application here (which is the use-case I had in mind for it). Asynchronous programming is hard. Yes, I know I already said that but I don't think you thought through what that means in the Real World. What will happen is programmers will take a short-cut: they'll use little bits of synchronous code for jobs they know are so quick that no-one will notice. (Even the above tutorial does this: fs.renameSync)  The problem is any job involving any I/O device (like a hard disk file system or a TCP/IP socket) will take 10 times longer to finish than average, about 1% of the time. (I made that statistic up, but the principle is true, so stay with me...)

What does that mean? When that happens it will lock the whole web server up, and all the other requests will block. Take the extreme case of doing a sync action that results in a time-out because the resource has gone offline. If the time-out is 30 seconds, the whole web server is down for 30 seconds. All of your customers are getting time-outs. Every image comes up broken.
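
The effect is easy to reproduce in miniature. The sketch below is not node.js: it uses Python's asyncio event loop as a stand-in, on the assumption that a single-threaded event loop behaves the same way in this respect. The handler names and the 0.3 second delay are invented for the demonstration.

```python
import asyncio
import time

async def slow_handler(blocking):
    """Pretend request that waits 0.3 s on a slow resource."""
    if blocking:
        time.sleep(0.3)           # synchronous: freezes the whole event loop
    else:
        await asyncio.sleep(0.3)  # asynchronous: yields to other requests

async def fast_handler(started):
    """Pretend request that needs no I/O; returns how long it had to wait."""
    return time.monotonic() - started

async def serve_two_requests(blocking):
    started = time.monotonic()
    # Both "requests" arrive at the same moment; the slow one is first in line.
    _, fast_latency = await asyncio.gather(
        slow_handler(blocking), fast_handler(started)
    )
    return fast_latency

for blocking in (False, True):
    latency = asyncio.run(serve_two_requests(blocking))
    print(f"blocking={blocking}: fast request answered after {latency:.2f}s")
```

With the asynchronous wait the fast request is answered almost immediately; with the synchronous one it is stuck behind the slow request for the full 0.3 seconds. Swap 0.3 seconds for a 30 second time-out and you have the outage described above.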

Another nice thing about the Node Beginner tutorial is the links to deeper information... and nested in one of the comments is a link to a paper comparing threads and events: Why Events Are A Bad Idea. Well, their conclusion is in the title, but if you look at their charts the important thing to learn is that well-written event handling code and well-written threading code are basically as quick as each other. (You won't ever reach the right side of the charts in a real website on a single server; their example is just serving a static image. So the differences are only of interest to academics.)

But, although dealing with threads is hard, asynchronous programming is even harder, IMHO.

Now, if node.js had a web server module where it maintained a thread pool and each new request got its own thread, then I could program in a synchronous style in my thread, happy in the knowledge I won't break the web server, and also happy I'll be able to meet my deadlines...
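
For comparison, this is roughly what that wish looks like, again sketched in Python rather than node.js, using the standard library's ThreadPoolExecutor. The handler names, pool size and delay are assumptions made up for the illustration; it is not a web server, just two pretend requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_handler():
    time.sleep(0.3)  # plain synchronous code, e.g. waiting on a slow disk
    return "slow done"

def fast_handler(started):
    return time.monotonic() - started  # how long this request had to wait

def serve_two_requests():
    started = time.monotonic()
    # Each request gets its own worker thread from the pool.
    with ThreadPoolExecutor(max_workers=4) as pool:
        slow = pool.submit(slow_handler)
        fast = pool.submit(fast_handler, started)
        return slow.result(), fast.result()

_, latency = serve_two_requests()
print(f"fast request answered after {latency:.2f}s despite the slow one")
```

Here the slow handler is written in a completely synchronous style, yet the fast request is not held up by it, because the two run on different threads.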

Sunday, July 3, 2011

Apache: use both PHP module and PHP-CGI

I had a need the other day to configure apache so it uses the PHP Apache module for all directories except one, where I wanted it to use the cgi version of PHP. The apache module is more efficient, but runs as part of the apache process. I wanted one URL to run in its own process. (I was troubleshooting and had a hunch this might help; luckily it did in this case!) Instructions for both Linux and Windows follow.

On Ubuntu I needed some preparation: 1. install the php5-cgi package (different from php5-cli, which is what we use when using php from the commandline); 2. enable the actions module.
On Windows I was using xampp, which follows an Everything-But-The-Kitchen-Sink philosophy, and all I needed was already there.

Here is the code I added (to my VirtualHost). First for Ubuntu:
ScriptAlias /usephpcgi /usr/bin
Action  application/x-httpd-php-cgi  /usephpcgi/php5-cgi
And then for Windows XAMPP:
ScriptAlias /usephpcgi C:/xampp/php/
Action  application/x-httpd-php-cgi  /usephpcgi/php-cgi.exe
(That is copy and paste from working configurations, so I think the trailing slash must be optional!)

Then for both Ubuntu and Windows:
AddType application/x-httpd-php-cgi .phpcgi
Now, I have to admit my dirty secret: I cheated. Instead of enabling php-cgi for all *.php in one directory, I left *.php going to the Apache PHP module, and I created the *.phpcgi extension to use the cgi binary. Initially this was simply because I managed to get it working that way; but on reflection I realized I preferred it: I can switch a script between using the PHP module and php-cgi just by changing the extension; also I can use php-cgi anywhere in my VirtualHost. If that does not sound so useful I should explain the script in question is already hidden behind an Alias, something like this, so no public-facing URLs need to change:
AliasMatch ^/admin/(.*?)/(.*)$ /path/to/admin.phpcgi

What about my original plan to configure it for a directory, without changing the file extensions? I had trouble with that and gave up; but Ben kindly left a comment, so now I know how to do it. First, you still need the ScriptAlias and Action directives shown above. Then it is simply this:
<Directory /var/www/path/to>
    <FilesMatch "\.ph(p3?|tml)$">
        SetHandler application/x-httpd-php-cgi
    </FilesMatch>
</Directory>
As Ben explains (see comment #1 below), the reason we need the FilesMatch inside the Directory is that mod_php sets a global FilesMatch directive, and that takes priority over our attempts to use AddType or AddHandler for a directory.
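
If you want to check which files that FilesMatch pattern will catch, a quick test in Python does the job. Apache uses PCRE rather than Python's `re` module, but for a pattern this simple I'd expect the two to agree. The filenames are made-up examples.

```python
import re

# Same pattern as in the FilesMatch directive above.
pattern = re.compile(r"\.ph(p3?|tml)$")

for name in ["index.php", "old.php3", "page.phtml", "admin.phpcgi", "notes.txt"]:
    handled = pattern.search(name) is not None
    print(f"{name}: {'php-cgi' if handled else 'untouched'}")
```

The first three names match and would be handed to php-cgi inside that directory; admin.phpcgi and notes.txt do not match, so the .phpcgi extension trick still relies on its own AddType line.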