Roland Scheidegger f4821daed1 llvmpipe: do transpose/untwiddle after conversion for 8bit formats
Generally we should do the transpose after conversion if the format has less
than 32 bits per channel (if it has 32 bits, conversion is going to be a no-op
anyway...). This is because there are fewer vectors to deal with afterwards.
Though the advantage for 16 bit formats isn't that big, and in fact with AVX
there isn't really any (as the 32bit unpacks can be done with 256bit, but
the smaller ones cannot, although that would change again with proper AVX2
support).
Only makes sense for 2d and not 1d cases. And to keep things easy, only handle
1, 2 and 4 channels (rgbx is just fine).
For rgba unorm8 format the backend conversion sums up to these instruction
totals (not counting the movs for SSE2 due to 2-op syntax - generally every 2
unpacks need an additional mov).
                     SSE2                    AVX
transpose:           32 unpack               16 unpack
untwiddle:           0                       8 (128bit low/high permutes)
convert:             16 mul + 16 cvt         8 mul + 8 cvt
32->8bit:            12 pack                 8 (128bit extract) + 12 pack

When doing transpose/untwiddle afterwards we get:
convert:             16 mul + 16 cvt         8 mul + 8 cvt
32->8bit:            12 pack                 8 (128bit extract) + 12 pack
transpose/untwiddle: 12 unpack               12 unpack

So for SSE2, this drops 20 unpacks (total instruction count 76->56)
whereas for AVX it replaces the 16 256bit unpacks with 12 128bit ones
and drops the 8 lo/hi permutes (in total 60->48). (Albeit to be fair,
the permutes could be dropped even when doing the transpose first,
they are extremely pointless but we'd need to be able to tell
lp_build_conv to reorder the vectors, for AVX2 we're going to need to
be able to tell lp_build_conv about ordering in any case.)

(With different ordering going into conversion, it would be possible
to do 4 unpacks + 4 pshufbs instead of 12 unpacks, but that might not
be better, and not all cpus can do it. Proper AVX2 support should eliminate
the 8 128bit extracts, reduce these 12 packs to 6 and the 12 unpacks to 2
pshufb + 2 permq ideally (+ 2 final 128bit extracts).)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-06 23:13:34 +01:00
bin Introduce .editorconfig 2016-08-31 17:06:54 -07:00
docs i965: Enable several GLES 3.1 extensions on HSW+ 2017-01-06 12:42:43 -08:00
doxygen
include dri: Add __DRI_IMAGE_FORMAT_ARGB1555 2016-12-27 09:13:43 -08:00
m4
scons scons: Recognize LLVM_CONFIG environment variable. 2016-11-24 13:37:33 -08:00
scripts get_reviewer.pl: fix mesa check 2016-08-30 16:44:00 -04:00
src llvmpipe: do transpose/untwiddle after conversion for 8bit formats 2017-01-06 23:13:34 +01:00
.dir-locals.el dir-locals.el: Adds White Space support 2016-11-14 19:17:49 +02:00
.editorconfig editorconfig: Fix up the tab rendering width. 2017-01-03 10:38:53 -08:00
.gitattributes
.gitignore
.mailmap
.travis.yml travis: remove no longer needed libudev-dev dependency 2016-10-18 17:06:24 +01:00
Android.common.mk android: avoid using libdrm with host modules 2016-11-02 14:43:26 +00:00
Android.mk android: add support for libmesa_amdgpu_addrlib 2016-09-13 10:06:04 +10:00
appveyor.yml appveyor: Update winflexbison download URL. 2016-09-13 17:54:51 +01:00
autogen.sh
CleanSpec.mk
common.py scons: Recognize LLVM_CONFIG environment variable. 2016-11-24 13:37:33 -08:00
configure.ac configure: Fix another bashism. 2017-01-05 09:24:28 -08:00
install-gallium-links.mk gallium: Fix install-gallium-links.mk on non-bash /bin/sh 2016-10-10 08:56:12 -07:00
install-lib-links.mk
Makefile.am automake: don't forget to pick wglext.h in the tarball 2016-10-24 09:44:26 +01:00
REVIEWERS reviewers: add Rob H for the Android EGL+build parts 2016-11-21 16:01:06 +00:00
SConstruct
VERSION docs: add 13.1.0-devel release notes template, bump version 2016-10-19 19:10:16 +01:00

File: docs/README.WIN32

Last updated: 21 June 2013


Quick Start
----- -----

Windows drivers are built with SCons.  Makefiles or Visual Studio projects are
no longer shipped or supported.

Run

  scons libgl-gdi

to build the gallium-based GDI driver.

This works with both MSVS and MinGW.


Windows Drivers
------- -------

At this time, only the gallium GDI driver is known to work.

Source code also exists in the tree for other drivers in
src/mesa/drivers/windows, but the status of this code is unknown.

Recipe
------

Building on Windows requires several open-source packages. These are
the steps that work as of this writing.

- install python 2.7
- install scons (latest)
- install mingw, flex, and bison
- install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
  get pywin32-218.4.win-amd64-py2.7.exe
- install git
- download mesa from git
  see http://www.mesa3d.org/repository.html
- run scons

General
-------

After building, you can copy the resulting DLL files to a place in your
PATH such as $SystemRoot/SYSTEM32.  If you don't like putting things
in a system directory, place them in the same directory as the
executable(s).  Be careful about accidentally overwriting files of
the same name in the SYSTEM32 directory.

The DLL files are built so that the external entry points use the
stdcall calling convention.

Static LIB files are not built.  The LIB files that are built are
the linker import files associated with the DLL files.

The si-glu sources are used to build the GLU libs.  This was done
mainly to get the better tessellator code.

If you have a Windows-related build problem or question, please post
to the mesa-dev or mesa-users list.