Gentoo Forums
[SOLVED] Python thinks encoding is ANSI_X3.4-1968; not utf8
slycordinator
Advocate


Joined: 31 Jan 2004
Posts: 3065
Location: Korea

Posted: Wed Oct 16, 2019 5:49 am    Post subject: [SOLVED] Python thinks encoding is ANSI_X3.4-1968; not utf8

python3 is borking on utf-8 files, saying that the ascii codec can't decode them. When I check the default encoding that python thinks I have, it gives ANSI_X3.4-1968.

Code:
# python3 -c "import locale; print(locale.getpreferredencoding(False))"
ANSI_X3.4-1968


My locale is set to utf-8:
Code:
# eselect locale list
Available targets for the LANG variable:
  [1]   C
  [2]   C.utf8
  [3]   en_US
  [4]   en_US.ansix341968
  [5]   en_US.utf8 *
  [6]   ko_KR.euckr
  [7]   ko_KR.utf8
  [8]   POSIX
  [ ]   (free form)


Code:
# locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=en_US.utf8


Code:
# cat /etc/env.d/02locale
# Configuration file for eselect
# This file has been automatically generated.
LANG="en_US.utf8"
LC_ALL="en_US.utf8"


The LC_ALL setting was added by hand, after which I ran "env-update && source /etc/profile". I read that the setting could affect this.
But the output/errors are the same before and after. python still thinks my default encoding is "ANSI" instead of utf8.
_________________
My political stance/bias
slycordinator != slycoordinator


Last edited by slycordinator on Thu Oct 17, 2019 8:56 am; edited 1 time in total
slycordinator
Advocate


Joined: 31 Jan 2004
Posts: 3065
Location: Korea

Posted: Wed Oct 16, 2019 10:19 am

So, I created a 1-line file that's utf-8 (just random letters plus some random Korean characters). python thinks it's encoded as ANSI.

Code:
# file blah
blah: UTF-8 Unicode text
# python3
Python 3.6.9 (default, Oct 10 2019, 00:27:28)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> f=open('blah','r')
>>> f.encoding
'ANSI_X3.4-1968'


Code:
# python3 -c 'import locale; print(locale.getdefaultlocale())'
('en_US', 'UTF-8')

# python3 -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968
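
Independent of whatever the locale resolves to, a script can sidestep the guess by naming the encoding itself when it opens the file. A minimal sketch, assuming the file really is UTF-8; the helper name `read_utf8` and the sample text are made up for illustration and are not from the thread:

```python
import os
import tempfile

def read_utf8(path):
    """Read a text file as UTF-8 regardless of the locale's preferred encoding."""
    # Passing encoding= stops open() from falling back to
    # locale.getpreferredencoding(False), which can report ASCII
    # (ANSI_X3.4-1968) when the locale fails to load.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Hypothetical sample: a few letters plus two Korean characters,
# written with escapes so the script itself stays pure ASCII.
sample = "abc \ud55c\uae00\n"
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))

text = read_utf8(tmp_path)
os.remove(tmp_path)
print(text == sample)
```

This only works around the symptom in one script; it does not explain why the locale itself is not being picked up.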

mike155
Advocate


Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Wed Oct 16, 2019 12:27 pm

A wild guess: did you install python without USE flag 'wide-unicode'?
szatox
Advocate


Joined: 27 Aug 2013
Posts: 3151

Posted: Wed Oct 16, 2019 4:43 pm

Python is windows-retarded when it comes to guessing encoding.
I've hit this problem a long long time ago....
My program would work just fine when attached to the terminal, and crash on the first accented letter in any other case (file, pipe, etc).
The solution from SO or something like that was to unload some module and then load it again... And then it would output utf-8 to that file, so I finally had accents without crashes.
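
The module-reloading trick is not spelled out here, so the following is a different approach, not a reconstruction of it. One way to get UTF-8 output in Python 3 when stdout is a file or pipe is to wrap the binary stream explicitly. A rough sketch, with the helper name `utf8_text_stream` invented for illustration:

```python
import io

def utf8_text_stream(binary_stream):
    """Wrap a binary stream so text written to it is encoded as UTF-8."""
    # newline="\n" keeps line endings unchanged on every platform.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")

# Demonstrated on an in-memory buffer; in a real script the argument
# would typically be sys.stdout.buffer.
buf = io.BytesIO()
out = utf8_text_stream(buf)
out.write("caf\u00e9\n")   # an accented letter that ASCII cannot encode
out.flush()
encoded = buf.getvalue()
out.detach()               # leave the underlying buffer open
print(encoded)
```

Setting the PYTHONIOENCODING environment variable to utf-8 before starting the interpreter is another option that needs no code changes.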
slycordinator
Advocate


Joined: 31 Jan 2004
Posts: 3065
Location: Korea

Posted: Wed Oct 16, 2019 11:37 pm

mike155 wrote:
A wild guess: did you install python without USE flag 'wide-unicode'?
There is no 'wide-unicode' USE flag for python3
mike155
Advocate


Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Thu Oct 17, 2019 12:49 am

slycordinator wrote:
There is no 'wide-unicode' USE flag for python3

You're right. Only Python 2 has this USE flag. I'm sorry!
Ant P.
Watchman


Joined: 18 Apr 2009
Posts: 6920

Posted: Thu Oct 17, 2019 1:31 am

I get ANSI_X3.4-1968 if I set LC_ALL to an invalid value. Have you run locale-gen recently?
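
A quick way to test whether a locale name is actually loadable, rather than merely set in the environment, is to ask the C library through Python's `locale.setlocale`. This is only a sketch; the function name `locale_is_usable` is made up here, and which names load depends on what has been generated on the system:

```python
import locale

def locale_is_usable(name):
    """Return True if the C library can load the named locale."""
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        # Raises locale.Error when the locale is unknown or not generated.
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        # Put the process back the way we found it.
        locale.setlocale(locale.LC_ALL, saved)

print(locale_is_usable("C"))
print(locale_is_usable("xx_YY.bogus"))  # a deliberately invalid name
```

Calling `locale_is_usable("")` checks whatever LANG and LC_ALL currently request, which is the case that matters in this thread.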
slycordinator
Advocate


Joined: 31 Jan 2004
Posts: 3065
Location: Korea

Posted: Thu Oct 17, 2019 6:03 am

I see I'm getting "failed to set locale" and "not found: no such file or directory" output upon running locale-gen.

And for the secondary locales, it gives weird "unknown character" errors.

I'll see if rebuilding glibc makes it work.

Definitely strange that locale-gen gives "success" while outputting error upon error.
slycordinator
Advocate


Joined: 31 Jan 2004
Posts: 3065
Location: Korea

Posted: Thu Oct 17, 2019 6:11 am

I see what happened.

In copying over files that I edited on a Windows box, the locale.gen file got Windows line endings, and locale-gen went looking for locale files whose names included the unprintable carriage-return characters.

It's all good now. Although, it's still problematic that locale-gen succeeded with errors galore.
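
For anyone hitting the same thing, the stray carriage returns are invisible in most editors but easy to spot programmatically. A small sketch, assuming the file is read as raw bytes; the function name and the sample content are hypothetical:

```python
def lines_with_carriage_return(data):
    """Return 1-based numbers of lines in `data` (bytes) that end in CR before LF."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if line.endswith(b"\r"):
            bad.append(number)
    return bad

# Hypothetical locale.gen content saved with Windows (CRLF) line endings.
sample = b"en_US.UTF-8 UTF-8\r\nko_KR.UTF-8 UTF-8\r\n"
print(lines_with_carriage_return(sample))  # prints [1, 2]
```

For a real check, read the file in binary mode (typically /etc/locale.gen) and pass the bytes in; an empty list means the line endings are clean.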
All times are GMT