[PATCH 2 of 2 stable] hgweb: encode WSGI environment like OS environment

Yuya Nishihara yuya at tcha.org
Thu Jun 25 13:44:17 UTC 2020


On Thu, 25 Jun 2020 05:14:00 +0200, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <me at manueljacob.de>
> # Date 1593049567 -7200
> #      Thu Jun 25 03:46:07 2020 +0200
> # Branch stable
> # Node ID c115cca2d19d55c2538def5c95a68ceff597f45d
> # Parent  8f730a30fb20a104bbf5665e1f7d0d4e4aaedf6f
> # EXP-Topic cgi_env_encoding
> hgweb: encode WSGI environment like OS environment
> 
> Previously, the WSGI environment keys and values were encoded using latin-1.
> This resulted in a crash if a WSGI environment key or value could not be encoded
> using latin-1.
> 
> On Unix, the OS environment is byte-based. Therefore we should do the reverse of
> what Python does for os.environ.
> 
> On Windows, there’s no native byte-based OS environment. Therefore we should do
> the same as what mercurial.encoding does with the OS environment.
> 
> diff --git a/mercurial/hgweb/request.py b/mercurial/hgweb/request.py
> --- a/mercurial/hgweb/request.py
> +++ b/mercurial/hgweb/request.py
> @@ -8,10 +8,13 @@
>  
>  from __future__ import absolute_import
>  
> +import sys
> +
>  # import wsgiref.validate
>  
>  from ..thirdparty import attr
>  from .. import (
> +    encoding,
>      error,
>      pycompat,
>      util,
> @@ -162,10 +165,18 @@
>      # strings on Python 3 must be between \00000-\000FF. We deal with bytes
>      # in Mercurial, so mass convert string keys and values to bytes.
>      if pycompat.ispy3:
> +        fsencoding = sys.getfilesystemencoding()
> +
>          def tobytes(s):
>              if not isinstance(s, str):
>                  return s
> -            return s.encode('latin-1')
> +            if pycompat.iswindows:
> +                # This is what mercurial.encoding does for os.environ on Windows.
> +                return encoding.strtolocal(s)
> +            else:
> +                # This is what is documented to be used for os.environ on Unix.
> +                return s.encode(fsencoding, 'surrogateescape')

The Unix branch could use pycompat.fsencode() instead, which I think is more
widely used in our codebase.
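
To illustrate why the two spellings should be interchangeable on Unix, here is
a minimal sketch. It assumes that pycompat.fsencode behaves like the standard
library's os.fsencode on Python 3, so it uses os.fsencode directly. The helper
names tobytes_explicit and tobytes_fsencode are hypothetical and exist only for
this comparison; they are not part of the patch.

```python
import os
import sys


def tobytes_explicit(s):
    """Encode the way the patch spells it out for the Unix branch."""
    if not isinstance(s, str):
        return s
    return s.encode(sys.getfilesystemencoding(), 'surrogateescape')


def tobytes_fsencode(s):
    """Encode via os.fsencode (assumed equivalent of pycompat.fsencode)."""
    if not isinstance(s, str):
        return s
    return os.fsencode(s)


# Arbitrary bytes as they might arrive in a byte-based Unix environment.
raw = b'/caf\xc3\xa9/\xff'

# os.fsdecode is what Python applies when it builds os.environ on Unix;
# bytes that cannot be decoded become lone surrogates in the str.
value = os.fsdecode(raw)

# Both helpers reverse that decoding and recover the original bytes.
roundtrip_explicit = tobytes_explicit(value)
roundtrip_fsencode = tobytes_fsencode(value)
```

On a Unix system both round trips should give back the original bytes, whereas
encoding a value outside the latin-1 range with 'latin-1' raises
UnicodeEncodeError, which is the crash the patch description mentions.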


