Notes |
(0000546)
monoclast (reporter)
2009-03-06 13:21
|
I would like to add that I think it should be perfectly acceptable to use punctuation characters such as "?" in user names.
The real issue, IMO, is that Geeklog isn't properly encoding the user name for inclusion in a URL when it names the image file. |
|
(0000547)
monoclast (reporter)
2009-03-06 13:38
|
To fix the problem in the mean time, I've changed line 970 of usersettings.php to this:
790 $username = str_replace ('?', '_', $_USER['username']);
791 $username = str_replace ('!', '_', $username);
792 $filename = $username . '.' . $fextension; |
|
(0000548)
dhaun (administrator)
2009-03-08 12:47
|
Okay. Or we could run the username through COM_sanitizeFilename - but that would pose the risk of duplicate file names. Actually, whatever replacement method we use, there's always the risk of duplicate file names ... |
|
(0000549)
jmucchiello (reporter)
2009-03-08 15:20
|
urlencode the username then you know the filename is url safe. |
|
(0000550)
monoclast (reporter)
2009-03-08 16:16
|
My only concern with doing that is:
Are URL encoding sequences like %20 valid in file names on *nix, Mac, and Windows? If not, then I guess I would rather see Geeklog do a simple substitution like my work around solution to avoid problem characters specifically. |
|
(0000551)
jmucchiello (reporter)
2009-03-08 18:40
|
If that's the case, you can specifically convert % to something else. For example:
$username = str_replace(Array('_','%'), Array('%5f','_'), urlencode($username));
So all special url chars are converted to %nn. Then _ is converted to %5f (which I think is _) and then % is converted _. To reverse it:
$username = urldecode(str_replace(Array('_','%5f'), Array('%','_'), basename($filename));
There should be no collisions in this case since if the username contains %, it converts it first to %25 then to _25. |
|
(0000605)
tymoteusz3 (developer)
2009-03-29 00:15
edited on: 2009-03-29 01:02
|
A simple fix for this would be to use rawurlencode() function.
Since in all OS's the % (percent) symbol is a valid file name character, this function will convert all non-alphanumeric characters (excluding _-) to a %## value where ## is a hex number.
rawurlencode is used instead of urlencode, as urlencode encodes spaces (ascii 32) as a + symbol which is an invalid file character on windows.
I uploaded the fixed file usersettings.php..
Tim Patrick
|
|
(0000624)
dhaun (administrator)
2009-04-05 07:16
|
rawurlencode() looks like the way to go, but there are a few more places where the user photo is accessed that would also need to be changed (e.g. admin/user.php). Ideally, we would want to hide the rawurlencode calls in lib-user.php so that they're not spread all over the code. |
|
(0000625)
tymoteusz3 (developer)
2009-04-05 10:08
|
Yes I figured as much.. Maybe the best idea would just be to rawurlencode the username when it is saved in the database...
Tim |
|
(0000657)
tymoteusz3 (developer)
2009-04-13 10:55
|
Looked around a bit, the best place to use this function is in the USER_createAccount function. This means that as soon as the user creates the name it is automatically encoded so in the database it will be the 'safe' version. So no more conversions anywhere else in the site.
On line 254 change to:
$username = addslashes (rawurlencode($username));
- Tim |
|
(0000658)
tymoteusz3 (developer)
2009-04-13 10:56
|
Forgot to note line 254 of lib-user.php |
|
(0000661)
m_c (reporter)
2009-04-13 15:40
edited on: 2009-04-13 15:41
|
Wouldn't it be better to accept only the usernames, that match a certain regex?
For example ^[:word:]{4,32}$ in POSIX (same as ^[A-Za-z0-9_]{4,32}$ in ASCII).
It's like a common standard, so most people already use a nick of this type somewhere (UNIX, mail, forums and such).
All the url and filename safe-proofing issues magically dissapear.
|
|
(0000662)
dhaun (administrator)
2009-04-13 16:27
|
Actually, the last time I suggested restricting usernames like that, our friends from geeklog.jp were quick to point out that using Japanese characters in usernames is common there:
http://eight.pairlist.net/pipermail/geeklog-devel/2008-September/003883.html
Not to mention the problem of upgrading existing sites. Geeklog.net sure has a lot of users with "funky" usernames ... |
|
(0000663)
m_c (reporter)
2009-04-13 17:39
|
Well, still they can't use UTF-8 in their e-mail adresses, though. Anyway, that was a little bit selfish of me.
How about naming the files and urls (or whatever) with UID, instead of username? |
|
(0000664)
jmucchiello (reporter)
2009-04-13 17:41
|
Actually it might make more sense to deny certain characters rather than limit the allowed character set. So eliminate (not a complete list) %&@(){}[];~`'"+ and chr(32) or less. And then make this list configurable, of course. |
|
(0000665)
tymoteusz3 (developer)
2009-04-13 20:59
|
But simply using rawurlencode once will do the trick. (When the user signs up)
And for existing users, a patch can be created that will change the stored usernames, and associated files.
This while a bit more work, will be worth the payoff as no limiting character sets will have to be used |
|
(0000686)
hallomarkus (reporter)
2009-04-23 04:09
|
Just a side hint: I don`t rember if there was a discussion on "reserved user names". That would also be a good thing to block unwanted user names. |
|
(0000888)
monoclast (reporter)
2009-07-23 09:56
|
I'm not so sure storing the URL encoded username in the database (by modification of line 254 of lib-user.php) is really the right way to go. Wouldn't that mean that sometimes when we pull the username from the DB to use it, we need to first decode it? For instance, what happens when a user sends an email to another user? Is the username in the email going to be URL encoded? That wouldn't be right. Has anyone thought about what other cases there might be like this?
From a purist standpoint, it seems to me the username in the database should be the simple unencoded username so that it can be easily used by anyone, and those routines wishing to use the username to create filenames, use it in URLs, and so on should encode it as needed. |
|