View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|14091||Bug reports||Survey taking||public||2018-09-24 16:21||2021-11-12 10:45|
|Summary||14091: Filenames of uploads starting with special characters truncated with invalid setlocale|
|Description||Non-ASCII characters (e.g. umlauts such as äöü) at the beginning of uploaded file names are cut off.|
|Steps To Reproduce||When taking a survey that contains a file upload field:|
1. Choose a file with an umlaut at the beginning
2. Observe that the umlaut at the beginning of the file name is cut off
|Additional Information||This is due to locale-awareness of PHP's basename().|
Also see https://stackoverflow.com/questions/45268499/php-basename-and-pathinfo-with-multibytes-utf-8-file-names/45268539#45268539 for other related functions with the same feature.
in index.php fixes it, but I am sure this is not the right place to add this.
My system locale (on debian) is correctly set (apart from LC_ALL, which normally is just an override and should not be set).
Please either respect system's LANG or set the locale to some UTF-8 compatible one.
|Tags||No tags attached.|
|Complete LimeSurvey version number (& build)||3.27.23|
|I will donate to the project if issue is resolved||No|
|Database type & version||mySQL 5.5.60|
|Server OS (if known)||debian|
|Webserver software & version (if known)||apache 2.4.10 or nginx|
|PHP Version||5.6.36 or 7.4.25|
|Yes, we have strict filtering of file names for security reasons.|
I understand file names get filtered, but this does not seem related. Why does setting the correct locale circumvent the filtering?
I feel like filtering should be done after correctly interpreting the code points with the correct locale.
But I may be wrong.
If only ASCII (or even a more reduced subset) is allowed, a note should advise users to rename their files (and only list allowed characters). Though this seems pretty uncommon. There should be standard functions to sanitize the string and frankly, umlauts don't seem all too dangerous :)
«yes we have hard filtering of file names for security reason. » ???
Filtering äöü has nothing to do with security, and these characters are not filtered like this in the 'Upload question' type.
@tbart: where is it filtered like this? And what is the broken locale?
Your filenames do not start with special characters. See screenshot attached.
The locale should be en_US.UTF-8, at least that's what the environment sets for LANG:
# set | egrep "^(LANG|LC)"
My filename was äöü.png ;), but my locale is OK.
Can you show your current locale in PHP?
Sorry, I misread your screenshot and thought the left column were the filenames, my bad.
phpinfo() gives me:
It's still strange why only umlauts at the beginning get stripped off. This should not be locale dependent!
|Please update to the latest version and check if the bug can still be reproduced. Thank you.|
I can confirm this is still broken in Version 3.25.17+210309.
Attached is a sample survey+file. Uploading the file results in a file name "sterreich.png"
in index.php still fixes it.
limesurvey_survey_185162.lss (13,472 bytes)
@ollehar: does this need a fix here?
Is it a server issue or not?
Otherwise, a simple fix:
I have no way to reproduce this: I would need to break my PHP ;)
|@tbart: can you check with ONLY setlocale(LC_CTYPE, 'en_US.UTF8');?|
Way to reproduce: in config.php, add setlocale(LC_CTYPE, "C");
I must check whether I can fix this without setlocale() (because of Microsoft systems, and I am unsure en.utf8 exists everywhere).
With LC_CTYPE set to C:
$filename = österreich.png
basename($filename) = sterreich.png
> basename() is locale aware, so for it to see the correct basename with multibyte character paths, the matching locale must be set using the setlocale() function. If path contains characters which are invalid for the current locale, the behavior of basename() is undefined.
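One way to avoid depending on the locale at all is to extract the last path component with plain byte operations. The following is only a minimal sketch, assuming the file name is UTF-8 encoded; `utf8_basename` is a hypothetical helper, not an existing PHP or LimeSurvey function, and treating backslashes as separators is an assumption made here to cover client-side Windows paths.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or LimeSurvey): return the last
 * path component without consulting the current locale.
 *
 * '/' and '\' are single ASCII bytes that never occur inside a UTF-8
 * multibyte sequence, so a plain byte search is safe for UTF-8 input.
 */
function utf8_basename($path)
{
    // Assumption: backslashes are separators too (client-side Windows paths).
    $path = rtrim(str_replace('\\', '/', $path), '/');

    // Locate the last separator; everything after it is the file name.
    $pos = strrpos($path, '/');

    return $pos === false ? $path : substr($path, $pos + 1);
}
```

For example, `utf8_basename('/tmp/österreich.png')` returns `österreich.png` regardless of the `LC_CTYPE` setting, because no locale-aware function is involved. Any file name sanitizing would still have to run on the result.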
This is on a Debian and/or Ubuntu server.
I just found out that as a default, they set LANG to C in /etc/apache2/envvars.
(with 3 closing brackets :-) ) is not true, so this does not help.
alone does help as well.
On a not so unrelated sidenote:
This only works as long as Apache does not have mod_perl2 enabled as well, starting on Ubuntu 20.04 at least. See https://bugs.php.net/bug.php?id=81596 for details.
It seems to be generally discouraged to call setlocale() in an application, so I am unsure whether my suggested "fix" for this issue is good or not. Maybe I did not understand nikic correctly in that thread and it is only discouraged for strftime()?
However, as Debian/Ubuntu (and most likely all other Debian-based distros), which I guess make up a huge percentage of all LimeSurvey installations, set "C" as their default locale, some fix needs to be devised.
Maybe a documentation hint is necessary (mentioning the mod_perl2 issue would be practical as well; it took me hours to find this)?
Setting the locale via /etc/apache2/envvars is not thoroughly tested by me, I cannot state whether this is an actual fix that works without any side effects.
How is your/everyone else's server setup so it seems it does not use "C" as a default locale?
What I still don't get is why utf-8 characters in the middle of a filename work with the "C" locale and those at the beginning don't. This really seems like a limesurvey bug, still.
Oops, the link to the pull request was not created.
|Discussion ongoing in github|
Added as related: 17718: sanitize_filename didn't really fix filename for all system
|2018-09-24 16:21||tbart||New Issue|
||Assigned To||=> LouisGac|
||Status||new => closed|
||Resolution||open => no change required|
||Note Added: 50169|
|2019-01-14 17:40||tbart||Note Added: 50188|
|2019-01-15 08:40||DenisChenu||Status||closed => feedback|
|2019-01-15 08:40||DenisChenu||Resolution||no change required => reopened|
|2019-01-15 08:40||DenisChenu||Note Added: 50192|
|2019-01-15 08:40||DenisChenu||File Added: Capture d’écran du 2019-01-15 08-40-13.png|
|2019-01-15 08:51||DenisChenu||Note Edited: 50192||View Revisions|
|2019-01-15 09:06||tbart||File Added: umlauts_at_the_beginning_cut_off.png|
|2019-01-15 09:06||tbart||Note Added: 50194|
|2019-01-15 09:06||tbart||Status||feedback => assigned|
|2019-01-15 10:00||DenisChenu||Note Added: 50205|
|2019-01-15 10:37||tbart||Note Added: 50207|
|2021-03-10 22:43||ollehar||Assigned To||LouisGac =>|
|2021-03-10 22:43||ollehar||Status||assigned => feedback|
|2021-03-10 22:43||ollehar||Note Added: 63212|
|2021-03-11 10:40||tbart||Note Added: 63290|
|2021-03-11 10:40||tbart||File Added: limesurvey_survey_185162.lss|
|2021-03-11 10:40||tbart||File Added: österreich.png|
|2021-03-11 10:40||tbart||Status||feedback => new|
|2021-03-11 11:38||ollehar||Priority||none => high|
|2021-10-07 14:11||ollehar||Assigned To||=> ollehar|
|2021-10-07 14:11||ollehar||Status||new => acknowledged|
|2021-11-04 12:10||DenisChenu||Assigned To||ollehar => DenisChenu|
|2021-11-04 12:14||DenisChenu||Note Added: 67112|
|2021-11-04 12:18||DenisChenu||Note Added: 67113|
|2021-11-04 15:41||DenisChenu||Note Added: 67116|
|2021-11-04 15:42||DenisChenu||Status||acknowledged => confirmed|
|2021-11-04 15:42||DenisChenu||Complete LimeSurvey version number (& build)||3.14.9+180917 => 3.27.23|
|2021-11-04 15:43||DenisChenu||Summary||Filenames of uploads starting with special characters truncated => Filenames of uploads starting with special characters truncated with invalid setlocale|
|2021-11-04 15:43||DenisChenu||Webserver software & version (if known)||apache 2.4.10 => apache 2.4.10 or nginx|
|2021-11-04 15:43||DenisChenu||PHP Version||5.6.36 => 5.6.36 or 7.4.25|
|2021-11-04 15:47||DenisChenu||Note Added: 67118|
|2021-11-04 15:48||DenisChenu||Note Added: 67119|
|2021-11-10 11:23||tbart||Note Added: 67219|
|2021-11-10 12:00||DenisChenu||Note Added: 67226|
|2021-11-12 10:44||galads||Note Added: 67273|
|2021-11-12 10:44||galads||Bug heat||8 => 10|
|2021-11-12 10:45||DenisChenu||Relationship added||related to 17718|
|2021-11-12 10:45||DenisChenu||Note Added: 67274|