
Bug#677086: apache2-mpm-prefork: apache2 sends "400 bad request" on POST from some firefox browsers



I have collected some statistical data (by analysing the apache
logs) on how often this problem occurs on our webserver:
~38% of our users upload files by using a firefox browser
~6% of these firefox users get the error (400 bad request) when
uploading a file

bash command:
cat access.log | egrep '"POST .+" 400 ' | cut -d ' ' -f 2 | sort -u | wc -l

This command prints the number of unique users who hit the
problem. And I think 6 out of 100 firefox users are too many
to ignore.
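
The same count can be sketched in Python, which also makes it easy to relate the affected clients to all clients that sent a POST. This is only a rough equivalent of the pipeline above: the function names are made up for illustration, the assumption that the second space-separated field identifies the client simply mirrors `cut -f 2` and depends on the LogFormat in use, and the share is computed over all POSTing clients, not only firefox users.

```python
import re

# Same pattern as the egrep call: a POST request line followed by status 400.
POST_400 = re.compile(r'"POST .+" 400 ')
# Any POST request line followed by a three-digit status code.
POST_ANY = re.compile(r'"POST .+" \d{3} ')


def unique_clients(lines, pattern, field=1):
    """Return the set of distinct client fields on lines matching pattern.

    field=1 is the second space-separated field, like `cut -d ' ' -f 2`.
    Which field identifies the client depends on the LogFormat (assumption).
    """
    clients = set()
    for line in lines:
        if pattern.search(line):
            parts = line.rstrip("\n").split(" ")
            if len(parts) > field:
                clients.add(parts[field])
    return clients


def affected_share(lines):
    """Fraction of POSTing clients that got at least one 400 response."""
    lines = list(lines)
    posting = unique_clients(lines, POST_ANY)
    failing = unique_clients(lines, POST_400)
    return len(failing) / len(posting) if posting else 0.0
```

Called as `affected_share(open("access.log"))`, it returns a value between 0 and 1.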

Remember: after upgrading apache from 2.2.16 (squeeze) to 2.2.22
(wheezy) the problem does NOT occur anymore. So which changes in
apache between 2.2.16 and 2.2.22 could be responsible for this
strange behavior?

Nothing seems to be relevant besides the "send 408 instead of 400 on
timeout". Maybe firefox automatically retries on 408, but doesn't on
400?
In my opinion, the failure occurs much earlier. Firefox doesn't even start the upload of the file when 2.2.16 is used! (see wireshark captures later in this mail)

You could try:

- create a tcpdump of a large upload to 2.2.22 and compare to what's
going on with 2.2.16
Here are two captures of an upload of a 512K test file:
2.2.16: http://uploadtest.puzzleandplay.de/capture-a2-2-16-512K.cap
2.2.22: http://uploadtest.puzzleandplay.de/capture-a2-2-22-512K.cap

Here is a screenshot showing the comparison of frame 4 (LHS 2.2.22; RHS 2.2.16)
http://uploadtest.puzzleandplay.de/comparison_of_captures_16_vs_22.png

My interpretation:
==================
2.2.22:
 * Frame 1-3: TCP Three-Way-Handshake (OK)
 * Frame 4: FF sends 1452 bytes to the server (maximum segment size MSS=1460) (OK)
 * Frame 5: server sends an ACK that Frame 4 was received successfully (OK)
 * Frame >= 6: FF continuously sends data segments and then the connection is closed as expected (OK)

2.2.16:
 * Frame 1-3: TCP Three-Way-Handshake (OK)
 * Frame 4: FF sends ONLY 397 bytes (instead of 1452!) to the server (maximum segment size MSS=1460) AND sets the PSH flag! (I think NOT OK!)
   => WHY does FF behave this way? And why does FF behave differently with 2.2.22? The only information FF has received from the server up to this point is the SYN,ACK in Frame 2, and I can't see any difference to the SYN,ACK in Frame 2 of 2.2.22!
 * Frame 5: server (TCP) sends an ACK that Frame 4 was received successfully (OK)
 * Frame 6: apache (HTTP) sends "400 Bad Request" about 20 seconds after the last frame. This should be OK because FF didn't send any more header data for more than 20 seconds (reqtimeout.conf: RequestReadTimeout header=20-40,minrate=500)
 * Frame 7-10: server initiates the connection termination and the connection is closed as expected (OK)

A closer look at the data in frame 4 (2.2.16) shows that it stops EXACTLY after the Referer header field (including CR LF). In frame 4 (2.2.22) the header fields "Content-Type: multipart/form-data; boundary=---------------------------114782935826962" and "Content-Length: 524616" directly follow this header field. It seems as if FF stops deliberately at exactly this position and therefore also sets the PSH flag, so that the data is sent to the server immediately (instead of being buffered until the maximum segment size MSS is reached)!?
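
To make this observation concrete: an HTTP/1.x header section is only complete once an empty line (CR LF CR LF) has been received, so a segment that ends right after the Referer line leaves the server waiting for more header data. A minimal sketch of that check follows; the function names and the sample request in the usage note are invented for illustration and are not taken from the capture.

```python
def header_section_complete(data: bytes) -> bool:
    """True once the blank line that ends the HTTP header section has arrived."""
    return b"\r\n\r\n" in data


def last_header_name(data: bytes) -> str:
    """Name of the last complete header line in a (possibly partial) request."""
    head, sep, _ = data.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    if not sep and not data.endswith(b"\r\n"):
        lines = lines[:-1]  # drop a trailing, still incomplete line
    lines = [line for line in lines if line]
    # lines[0] is the request line; header lines follow it.
    if len(lines) < 2:
        return ""
    return lines[-1].split(b":", 1)[0].decode("ascii", "replace")
```

For a buffer such as `b"POST /upload HTTP/1.1\r\nHost: example.invalid\r\nReferer: http://example.invalid/form\r\n"`, the first function reports that the headers are still incomplete and the second names Referer as the last header seen, which is the state the server is in after frame 4 (2.2.16).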

- Apply this patch to 2.2.16 and see if it helps:
http://svn.apache.org/viewvc?view=revision&revision=1100200
Applying the patch will only change the error code in frame 6 (2.2.16) from 400 to 408. IMHO frame 4 (2.2.16) is already the problem.

From the dump you uploaded, increasing mod_reqtimeout's timeouts could
also help as a workaround.
I am not sure why increasing mod_reqtimeout's timeouts should help. With 2.2.22 FF sends data continuously, and with 2.2.16 FF didn't send anything for more than 20 seconds. I don't think that FF will start sending data after waiting for more than 20 seconds. Nevertheless, I will try to get test results with higher timeouts configured.
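
For reference, the configured limit can be modelled roughly as follows. With `header=20-40,minrate=500` the client gets 20 seconds to send the headers, the deadline is extended by one second per 500 bytes received, and it is capped at 40 seconds. The function below is a simplification of that documented behavior, not mod_reqtimeout's actual code; its name and the proportional (rather than stepwise) extension are assumptions.

```python
def header_deadline(bytes_received: int, initial: float = 20.0,
                    maximum: float = 40.0, minrate: float = 500.0) -> float:
    """Seconds after which the header read would time out (simplified model)."""
    if minrate <= 0:
        # No minimum rate configured: the initial timeout is never extended.
        return initial
    return min(initial + bytes_received / minrate, maximum)
```

With the 397 bytes of frame 4 this model gives a deadline just under 21 seconds, which is in line with the 400 arriving about 20 seconds later. Raising the limits only moves that deadline; it does not make the client send the rest of the headers.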

But the > 20 second delay is clearly a problem on the client side,
because it happens before the server sends anything on that connection.
I agree with you in principle. But if apache really didn't send anything before on that connection then it must be impossible for FF to distinguish between 2.2.16, 2.2.22, nginx or IIS and therefore frame 4 should always look the same ;-)

Thanks a lot!
Thomas


