Build improvements, minor code changes

justuser 2025-01-25 11:24:37 +03:00
parent 8535d05433
commit 4248bf4f5b
3300 changed files with 364904 additions and 27 deletions

63
busybox-1_37_0/.gitignore vendored Normal file

@@ -0,0 +1,63 @@
#
# Kbuild ignores
#
.*
*.o
*.o.*
*.a
*.s
Kbuild
Config.in
#
# Never ignore these
#
!.gitignore
#
# Normal output
#
/busybox
/busybox_old
/busybox_unstripped*
#
# Backups / patches
#
*~
*.orig
*.rej
/*.patch
#
# debugging stuff
#
core
.gdb_history
.gdbinit
#
# testing output
#
/busybox.links
/runtest-tempdir-links
/testsuite/echo-ne
#
# cscope output
#
cscope.files
cscope.in.out
cscope.out
cscope.po.out
#
# ctags output
#
tags
TAGS
#
# user-supplied scripts
#
/embed

186
busybox-1_37_0/AUTHORS Normal file

@@ -0,0 +1,186 @@
List of the authors of code contained in BusyBox.
If you have code in BusyBox, you should be listed here. If you should be
listed, or the description of what you have done needs more detail, or is
incorrect, _please_ let me know.
-Erik
-----------
Peter Willis <psyphreak@phreaker.net>
eject
Emanuele Aina <emanuele.aina@tiscali.it>
run-parts
Erik Andersen <andersen@codepoet.org>
Tons of new stuff, major rewrite of most of the
core apps, tons of new apps as noted in header files.
Lots of tedious effort writing these boring docs that
nobody is going to actually read.
Laurence Anderson <l.d.anderson@warwick.ac.uk>
rpm2cpio, unzip, get_header_cpio, read_gz interface, rpm
Jeff Angielski <jeff@theptrgroup.com>
ftpput, ftpget
Enrik Berkhan <Enrik.Berkhan@inka.de>
setconsole
Jim Bauer <jfbauer@nfr.com>
modprobe shell dependency
Edward Betts <edward@debian.org>
expr, hostid, logname, whoami
John Beppu <beppu@codepoet.org>
du, nslookup, sort
David Brownell <dbrownell@users.sourceforge.net>
zcip
Brian Candler <B.Candler@pobox.com>
tiny-ls(ls)
Randolph Chung <tausq@debian.org>
fbset, ping, hostname
Dave Cinege <dcinege@psychosis.com>
more(v2), makedevs, dutmp, modularization, auto links file,
various fixes, Linux Router Project maintenance
Jordan Crouse <jordan@cosmicpenguin.net>
ipcalc
Magnus Damm <damm@opensource.se>
tftp client
insmod powerpc support
Larry Doolittle <ldoolitt@recycle.lbl.gov>
pristine source directory compilation, lots of patches and fixes.
Glenn Engel <glenne@engel.org>
httpd
Gennady Feldman <gfeldman@gena01.com>
Sysklogd (single threaded syslogd, IPC Circular buffer support,
logread), various fixes.
Robert Griebl <sandman@handhelds.org>
modprobe, hwclock, suid/sgid handling, tinylogin integration
many bugfixes and enhancements
Karl M. Hegbloom <karlheg@debian.org>
cp_mv.c, the test suite, various fixes to utility.c, &c.
Daniel Jacobowitz <dan@debian.org>
mktemp.c
Matt Kraai <kraai@alumni.cmu.edu>
documentation, bugfixes, test suite
Rob Landley <rob@landley.net>
Became busybox maintainer in 2006.
sed (major rewrite in 2003, and I now maintain the thing)
bunzip2 (complete from-scratch rewrite, then mjn3 optimized the result)
sort (more or less from scratch rewrite in 2004, I now maintain it)
mount (rewrite in 2005, I maintain the new one)
Stephan Linz <linz@li-pro.net>
ipcalc, Red Hat equivalence
John Lombardo <john@deltanet.com>
tr
Glenn McGrath <glenn.l.mcgrath@gmail.com>
Common unarchiving code and unarchiving applets, ifupdown, ftpgetput,
nameif, sed, patch, fold, install, uudecode.
Various bugfixes, review and apply numerous patches.
Manuel Novoa III <mjn3@codepoet.org>
cat, head, mkfifo, mknod, rmdir, sleep, tee, tty, uniq, usleep, wc, yes,
mesg, vconfig, nice, renice,
make_directory, parse_mode, dirname, mode_string,
get_last_path_component, simplify_path, and a number of trivial libbb routines
also bug fixes, partial rewrites, and size optimizations in
ash, basename, cal, cmp, cp, df, du, echo, env, ln, logname, md5sum, mkdir,
mv, realpath, rm, sort, tail, touch, uname, watch, arith, human_readable,
interface, dutmp, ifconfig, route
Vladimir Oleynik <dzo@simtreas.ru>
cmdedit; bb_mkdep, xargs(current), httpd(current);
ports: ash, crond, fdisk (initial, unmaintained now), inetd, stty, traceroute,
top;
locale, various fixes
and irreconcilable critic of everything not perfect.
Bruce Perens <bruce@pixar.com>
Original author of BusyBox in 1995, 1996. Some of his code can
still be found hiding here and there...
Rodney Radford <rradford@mindspring.com>
ipcs, ipcrm
Tim Riker <Tim@Rikers.org>
bug fixes, member of fan club
Kent Robotti <robotti@metconnect.com>
reset, tons and tons of bug reports and patches.
Chip Rosenthal <chip@unicom.com>, <crosenth@covad.com>
wget - Contributed by permission of Covad Communications
Pavel Roskin <proski@gnu.org>
Lots of bug fixes and patches.
Gyepi Sam <gyepi@praxis-sw.com>
Remote logging feature for syslogd
Rob Sullivan <cogito.ergo.cogito@gmail.com>
comm
Linus Torvalds
mkswap, fsck.minix, mkfs.minix
Linus Walleij
fbset and fbsplash config RGBA parsing
rewrite of mdev helper to create devices from /sys/dev
Mark Whitley <markw@codepoet.org>
grep, sed, cut, xargs(previous),
style-guide, new-applet-HOWTO, bug fixes, etc.
Charles P. Wright <cpwright@villagenet.com>
gzip, mini-netcat(nc)
Enrique Zanardi <ezanardi@ull.es>
tarcat (since removed), loadkmap, various fixes, Debian maintenance
Tito Ragusa <farmatito@tiscali.it>
devfsd and size optimizations in strings, openvt, chvt, deallocvt, hdparm,
fdformat, lsattr, chattr, id and eject.
Paul Fox <pgf@foxharp.boston.ma.us>
vi editing mode for ash, various other patches/fixes
Roberto A. Foglietta <me@roberto.foglietta.name>
port: dnsd
Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
misc
Mike Frysinger <vapier@gentoo.org>
initial e2fsprogs, printenv, setarch, sum, misc
Jie Zhang <jie.zhang@analog.com>
fixed two bugs in msh and hush (exitcode of killed processes)
Maxime Coste <mawww@kakoune.org>
paste implementation
Roger Knecht <rknecht@pm.me>
tree

142
busybox-1_37_0/INSTALL Normal file

@@ -0,0 +1,142 @@
Building:
=========
The BusyBox build process is similar to the Linux kernel build:
  make menuconfig     # This creates a file called ".config"
  make                # This creates the "busybox" executable
  make install        # or make CONFIG_PREFIX=/path/from/root install
The full list of configuration and install options is available by typing:
make help
Quick Start:
============
The easy way to try out BusyBox for the first time, without having to install
it, is to enable all features and then use "standalone shell" mode with a
blank command $PATH.
To enable all features, use "make defconfig", which produces the largest
general-purpose configuration. It's allyesconfig minus debugging options,
optional packaging choices, and a few special-purpose features requiring
extra configuration to use. Then enable "standalone shell" feature:
  make defconfig
  make menuconfig
   # select Busybox Settings
   #   then General Configuration
   #     then exec prefers applets
   #   exit back to top level menu
   # select Shells
   #   then Standalone shell
   #   exit back to top level menu
   # exit and save new configuration
   #    OR
   # use these commands to modify .config directly:
  sed -e 's/.*FEATURE_PREFER_APPLETS.*/CONFIG_FEATURE_PREFER_APPLETS=y/' -i .config
  sed -e 's/.*FEATURE_SH_STANDALONE.*/CONFIG_FEATURE_SH_STANDALONE=y/' -i .config
  make
  PATH= ./busybox ash
Standalone shell mode causes busybox's built-in command shell to run
any built-in busybox applets directly, without looking for external
programs by that name. Supplying an empty command path (as above) means
the only commands busybox can find are the built-in ones.
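The effect of the two sed commands can be tried on a throwaway file before touching a real .config. The sketch below is only an illustration: the helper name enable_option and the three sample lines are assumptions, not part of BusyBox, and it writes to a temporary file instead of using "sed -i" so that it does not depend on GNU sed.

```shell
#!/bin/sh
set -eu

# enable_option FILE NAME  (hypothetical helper, not shipped with BusyBox)
# Replace every line that mentions NAME, whether it is set or shown as
# "# CONFIG_NAME is not set", with CONFIG_NAME=y.
enable_option() {
    file=$1
    name=$2
    sed -e "s/.*$name.*/CONFIG_$name=y/" "$file" > "$file.tmp"
    mv "$file.tmp" "$file"
}

# Sample configuration fragment, made up for this demonstration.
cfg=$(mktemp)
trap 'rm -f "$cfg"' EXIT
cat > "$cfg" <<'EOF'
CONFIG_ASH=y
# CONFIG_FEATURE_PREFER_APPLETS is not set
# CONFIG_FEATURE_SH_STANDALONE is not set
EOF

enable_option "$cfg" FEATURE_PREFER_APPLETS
enable_option "$cfg" FEATURE_SH_STANDALONE
cat "$cfg"
```

Because the pattern matches the whole line, the "is not set" comment form and an existing "=n" or "=y" setting are all rewritten the same way, and unrelated lines are left alone.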
Note that the standalone shell requires CONFIG_BUSYBOX_EXEC_PATH
to be set appropriately, depending on whether or not /proc/self/exe is
available. If you do not have /proc, then point that config option
to the location of your busybox binary, usually /bin/busybox.
Another solution is to patch the kernel (see
examples/linux-*_proc_self_exe.patch) to make exec("/proc/self/exe")
always work.
Configuring Busybox:
====================
Busybox is optimized for size, but enabling the full set of functionality
still results in a fairly large executable -- more than 1 megabyte when
statically linked. To save space, busybox can be configured with only the
set of applets needed for each environment. The minimal configuration, with
all applets disabled, produces a 4k executable. (It's useless, but very small.)
The manual configurator "make menuconfig" modifies the existing configuration.
(For systems without ncurses, try "make config" instead.) The two most
interesting starting configurations are "make allnoconfig" (to start with
everything disabled and add just what you need), and "make defconfig" (to
start with everything enabled and remove what you don't need). If menuconfig
is run without an existing configuration, make defconfig will run first to
create a known starting point.
Other starting configurations (mostly used for testing purposes) include
"make allbareconfig" (enables all applets but disables all optional features),
"make allyesconfig" (enables absolutely everything including debug features),
and "make randconfig" (produce a random configuration). The configs/ directory
contains a number of additional configuration files ending in _defconfig which
are useful in specific cases. "make help" will list them.
Configuring BusyBox produces a file ".config", which can be saved for future
use. Run "make oldconfig" to bring a .config file from an older version of
busybox up to date.
Installing Busybox:
===================
Busybox is a single executable that can behave like many different commands,
and BusyBox uses the name it was invoked under to determine the desired
behavior. (Try "mv busybox ls" and then "./ls -l".)
Installing busybox consists of creating symlinks (or hardlinks) to the busybox
binary for each applet enabled in busybox, and making sure these symlinks are
in the shell's command $PATH. Running "make install" creates these symlinks,
or "make install-hardlinks" creates hardlinks instead (useful on systems with
a limited number of inodes). This install process uses the file
"busybox.links" (created by make), which contains the list of enabled applets
and the path at which to install them.
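As a rough sketch of what the install step amounts to, the loop below reads a links file and creates one symlink per listed path under a prefix. It is not the real applets/install.sh: the function name install_links, the empty stand-in binary, and the two sample paths are assumptions made for the demonstration, and it uses absolute link targets for simplicity.

```shell
#!/bin/sh
set -eu

# install_links PREFIX LINKS_FILE  (hypothetical helper)
# For every absolute path listed in LINKS_FILE, create a symlink at
# PREFIX/path that points to PREFIX/bin/busybox.
install_links() {
    prefix=$1
    links_file=$2
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        mkdir -p "$prefix$(dirname "$path")"
        ln -sf "$prefix/bin/busybox" "$prefix$path"
    done < "$links_file"
}

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir -p "$demo/root/bin"
: > "$demo/root/bin/busybox"   # empty stand-in for the real binary
printf '%s\n' /bin/cat /usr/bin/awk > "$demo/busybox.links"

install_links "$demo/root" "$demo/busybox.links"
ls -l "$demo/root/bin" "$demo/root/usr/bin"
```

Replacing "ln -sf" with "ln -f" would give the hardlink variant, at the cost of requiring all links to live on the same filesystem as the binary.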
Installing links to busybox is not always necessary. The special applet name
"busybox" (or with any optional suffix, such as "busybox-static") uses the
first argument to determine which applet to behave as, for example
"./busybox cat LICENSE". (Running the busybox applet with no arguments gives
a list of all enabled applets.) The standalone shell can also call busybox
applets without links to busybox under other names in the filesystem. You can
also configure a standalone install capability into the busybox base applet,
and then install such links at runtime with one of "busybox --install" (for
hardlinks) or "busybox --install -s" (for symlinks).
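The two dispatch rules described above (use the invoked name, or use the first argument when invoked as "busybox") can be imitated with a toy shell script. This is not BusyBox code; the script name multicall and the applets hello and bye are invented for the illustration.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# "multicall" plays the role of the busybox binary in this sketch.
cat > "$workdir/multicall" <<'EOF'
#!/bin/sh
applet=$(basename "$0")
# Invoked under its own name: take the applet from the first argument.
if [ "$applet" = "multicall" ]; then
    if [ "$#" -eq 0 ]; then
        echo "applets: hello bye"
        exit 0
    fi
    applet=$1
    shift
fi
case "$applet" in
    hello) echo "hello ${1:-world}" ;;
    bye)   echo "bye ${1:-world}" ;;
    *)     echo "$applet: applet not found" >&2; exit 127 ;;
esac
EOF
chmod +x "$workdir/multicall"

# "Installing" an applet is just creating a link named after it.
ln -s multicall "$workdir/hello"

"$workdir/hello" there           # dispatch on the invoked name
"$workdir/multicall" bye there   # dispatch on the first argument
"$workdir/multicall"             # no arguments: list the applets
```

The symlink contains no code of its own; the script only looks at the name it was started under, which is why one file can serve every applet.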
If you enabled the busybox shared library feature (libbusybox.so) and want
to run tests without installing, set your LD_LIBRARY_PATH accordingly when
running the executable:
LD_LIBRARY_PATH=`pwd` ./busybox
Building out-of-tree:
=====================
By default, the BusyBox build puts its temporary files in the source tree.
Building from a read-only source tree, or building multiple configurations from
the same source directory, requires the ability to put the temporary files
somewhere else.
To build out of tree, cd to an empty directory and configure busybox from there:
  make KBUILD_SRC=/path/to/source -f /path/to/source/Makefile defconfig
  make
  make install
Alternately, use the O=$BUILDPATH option (with an absolute path) during the
configuration step, as in:
  make O=/some/empty/directory allyesconfig
  cd /some/empty/directory
  make
  make CONFIG_PREFIX=. install
More Information:
=================
See also the busybox FAQ, under the questions "How can I get started using
BusyBox" and "How do I build a BusyBox-based system?" The BusyBox FAQ is
available from http://www.busybox.net/FAQ.html

348
busybox-1_37_0/LICENSE Normal file

@@ -0,0 +1,348 @@
--- A note on GPL versions
BusyBox is distributed under version 2 of the General Public License (included
in its entirety, below). Version 2 is the only version of this license which
this version of BusyBox (or modified versions derived from this one) may be
distributed under.
------------------------------------------------------------------------
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

1329
busybox-1_37_0/Makefile Normal file

File diff suppressed because it is too large


@@ -0,0 +1,201 @@
# ==========================================================================
# Build system
# ==========================================================================
busybox.links: $(srctree)/applets/busybox.mkll $(objtree)/include/autoconf.h include/applets.h
	$(Q)-$(SHELL) $^ > $@
busybox.cfg.suid: $(srctree)/applets/busybox.mksuid $(objtree)/include/autoconf.h include/applets.h
	$(Q)-SUID="yes" $(SHELL) $^ > $@
busybox.cfg.nosuid: $(srctree)/applets/busybox.mksuid $(objtree)/include/autoconf.h include/applets.h
	$(Q)-SUID="DROP" $(SHELL) $^ > $@
.PHONY: install
ifeq ($(CONFIG_INSTALL_APPLET_DONT),y)
INSTALL_OPTS:= --none
endif
ifeq ($(CONFIG_INSTALL_APPLET_SYMLINKS),y)
INSTALL_OPTS:= --symlinks
endif
ifeq ($(CONFIG_INSTALL_APPLET_HARDLINKS),y)
INSTALL_OPTS:= --hardlinks
endif
ifeq ($(CONFIG_INSTALL_APPLET_SCRIPT_WRAPPERS),y)
ifeq ($(CONFIG_INSTALL_SH_APPLET_SYMLINK),y)
INSTALL_OPTS:= --sw-sh-sym
endif
ifeq ($(CONFIG_INSTALL_SH_APPLET_HARDLINK),y)
INSTALL_OPTS:= --sw-sh-hard
endif
ifeq ($(CONFIG_INSTALL_SH_APPLET_SCRIPT_WRAPPER),y)
INSTALL_OPTS:= --scriptwrapper
endif
endif
ifeq ($(CONFIG_FEATURE_INDIVIDUAL),y)
INSTALL_OPTS:= --binaries
LIBBUSYBOX_SONAME:= 0_lib/libbusybox.so.$(BB_VER)
endif
install: $(srctree)/applets/install.sh busybox busybox.links
	$(Q)DO_INSTALL_LIBS="$(strip $(LIBBUSYBOX_SONAME) $(DO_INSTALL_LIBS))" \
		$(SHELL) $< $(CONFIG_PREFIX) $(INSTALL_OPTS)
ifeq ($(strip $(CONFIG_FEATURE_SUID)),y)
	@echo
	@echo
	@echo --------------------------------------------------
	@echo You will probably need to make your busybox binary
	@echo setuid root to ensure all configured applets will
	@echo work properly.
	@echo --------------------------------------------------
	@echo
endif
install-noclobber: INSTALL_OPTS+=--noclobber
install-noclobber: install
uninstall: busybox.links
	rm -f $(CONFIG_PREFIX)/bin/busybox
	for i in `cat busybox.links` ; do rm -f $(CONFIG_PREFIX)$$i; done
ifneq ($(strip $(DO_INSTALL_LIBS)),n)
	for i in $(LIBBUSYBOX_SONAME) $(DO_INSTALL_LIBS); do \
		rm -f $(CONFIG_PREFIX)$$i; \
	done
endif
# Not very elegant: copies testsuite to objdir...
# (cp -pPR is POSIX-compliant (cp -dpR or cp -a would not be))
.PHONY: check
.PHONY: test
ifeq ($(CONFIG_UNIT_TEST),y)
UNIT_CMD = ./busybox unit
endif
check test: busybox busybox.links
$(UNIT_CMD)
test -d $(objtree)/testsuite || cp -pPR $(srctree)/testsuite $(objtree)
bindir=$(objtree) srcdir=$(srctree)/testsuite \
$(SHELL) -c "cd $(objtree)/testsuite && $(srctree)/testsuite/runtest $(if $(KBUILD_VERBOSE:0=),-v)"
.PHONY: release
release: distclean
cd ..; \
rm -r -f busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION); \
cp -pPR busybox busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION) && { \
find busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)/ -type d \
-name .svn \
-print \
-exec rm -r -f {} \; ; \
find busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)/ -type d \
-name .git \
-print \
-exec rm -r -f {} \; ; \
find busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)/ -type f \
-name .gitignore \
-print \
-exec rm -f {} \; ; \
find busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)/ -type f \
-name .\#* \
-print \
-exec rm -f {} \; ; \
tar -czf busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION).tar.gz \
busybox-$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)/ ; }
.PHONY: checkhelp
checkhelp:
$(Q)$(srctree)/scripts/checkhelp.awk \
$(patsubst %,$(srctree)/%,$(wildcard $(patsubst %,%/Config.in,$(busybox-dirs) ./)))
.PHONY: sizes
sizes: busybox_unstripped
$(NM) --size-sort $(<)
.PHONY: bloatcheck
bloatcheck: busybox_old busybox_unstripped
@$(srctree)/scripts/bloat-o-meter busybox_old busybox_unstripped
@$(CROSS_COMPILE)size busybox_old busybox_unstripped
.PHONY: baseline
baseline: busybox_unstripped
@mv busybox_unstripped busybox_old
.PHONY: objsizes
objsizes: busybox_unstripped
$(srctree)/scripts/objsizes
.PHONY: stksizes
stksizes: busybox_unstripped
$(CROSS_COMPILE)objdump -d busybox_unstripped | $(srctree)/scripts/checkstack.pl $(ARCH) | uniq
.PHONY: bigdata
bigdata: busybox_unstripped
$(CROSS_COMPILE)nm --size-sort busybox_unstripped | grep -vi ' [trw] '
# Documentation Targets
.PHONY: doc
doc: docs/busybox.pod docs/BusyBox.txt docs/busybox.1 docs/BusyBox.html
# FIXME: Doesn't belong here
cmd_doc =
quiet_cmd_doc = $(Q)echo " DOC $(@F)"
silent_cmd_doc =
disp_doc = $($(quiet)cmd_doc)
# sed adds newlines after "Options:" etc,
# this is needed in order to get good BusyBox.{1,txt,html}
docs/busybox.pod: $(srctree)/docs/busybox_header.pod \
include/usage.h \
$(srctree)/docs/busybox_footer.pod \
applets/usage_pod
$(disp_doc)
$(Q)-mkdir -p docs
$(Q)-( \
cat $(srctree)/docs/busybox_header.pod; \
echo; \
applets/usage_pod | sed 's/^[A-Za-z][A-Za-z ]*[a-z]:$$/&\n/'; \
cat $(srctree)/docs/busybox_footer.pod; \
) > docs/busybox.pod
docs/BusyBox.txt: docs/busybox.pod
$(disp_doc)
$(Q)-mkdir -p docs
$(Q)-pod2text $< > $@
docs/busybox.1: docs/busybox.pod
$(disp_doc)
$(Q)-mkdir -p docs
$(Q)-pod2man --center=busybox --release="version $(KERNELVERSION)" $< > $@
docs/BusyBox.html: docs/busybox.net/BusyBox.html
$(disp_doc)
$(Q)-mkdir -p docs
$(Q)-rm -f docs/BusyBox.html
$(Q)-cp docs/busybox.net/BusyBox.html docs/BusyBox.html
docs/busybox.net/BusyBox.html: docs/busybox.pod
$(Q)-mkdir -p docs/busybox.net
$(Q)-pod2html --noindex $< > $@
$(Q)-rm -f pod2htm*
# documentation, cross-reference
# Modern distributions already ship synopsis packages (e.g. debian)
# If you have an old distribution go to http://synopsis.fresco.org/
syn_tgt = $(wildcard $(patsubst %,%/*.c,$(busybox-alldirs)))
syn = $(patsubst %.c, %.syn, $(syn_tgt))
comma:= ,
brace_open:= (
brace_close:= )
SYN_CPPFLAGS := $(strip $(CPPFLAGS) $(EXTRA_CPPFLAGS))
SYN_CPPFLAGS := $(subst $(brace_open),\$(brace_open),$(SYN_CPPFLAGS))
SYN_CPPFLAGS := $(subst $(brace_close),\$(brace_close),$(SYN_CPPFLAGS))
#SYN_CPPFLAGS := $(subst ",\",$(SYN_CPPFLAGS))
#")
#SYN_CPPFLAGS := [$(patsubst %,'%'$(comma),$(SYN_CPPFLAGS))'']
%.syn: %.c
synopsis -p C -l Comments.SSDFilter,Comments.Previous -Wp,preprocess=True,cppflags="'$(SYN_CPPFLAGS)'" -o $@ $<
.PHONY: html
html: $(syn)
synopsis -f HTML -Wf,title="'BusyBox Documentation'" -o $@ $^
-include $(srctree)/Makefile.local


@ -0,0 +1,223 @@
# ==========================================================================
# Build system
# ==========================================================================
BB_VER = $(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
export BB_VER
SKIP_STRIP ?= n
# -std=gnu99 needed for [U]LLONG_MAX on some systems
CPPFLAGS += $(call cc-option,-std=gnu99,)
CPPFLAGS += \
-Iinclude -Ilibbb \
$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include -I$(srctree)/libbb) \
-include include/autoconf.h \
-D_GNU_SOURCE -DNDEBUG \
$(if $(CONFIG_LFS),-D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64) \
$(if $(CONFIG_TIME64),-D_TIME_BITS=64) \
-DBB_VER=$(squote)$(quote)$(BB_VER)$(quote)$(squote)
CFLAGS += $(call cc-option,-Wall,)
CFLAGS += $(call cc-option,-Wshadow,)
CFLAGS += $(call cc-option,-Wwrite-strings,)
CFLAGS += $(call cc-option,-Wundef,)
CFLAGS += $(call cc-option,-Wstrict-prototypes,)
CFLAGS += $(call cc-option,-Wunused -Wunused-parameter,)
CFLAGS += $(call cc-option,-Wunused-function -Wunused-value,)
CFLAGS += $(call cc-option,-Wmissing-prototypes -Wmissing-declarations,)
CFLAGS += $(call cc-option,-Wno-format-security,)
# warn about C99 declaration after statement
CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
# If you want to add more -Wsomething above, make sure that it is
# still possible to build bbox without warnings.
ifeq ($(CONFIG_WERROR),y)
CFLAGS += $(call cc-option,-Werror,)
## TODO:
## gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) is a PITA:
## const char *ptr; ... off_t v = *(off_t*)ptr; -> BOOM
## and no easy way to convince it to shut the hell up.
## We have a lot of such things all over the place.
## Classic *(off_t*)(void*)ptr does not work,
## and I am unwilling to do crazy gcc specific ({ void *ppp = ...; })
## stuff in macros. This would obfuscate the code too much.
## Maybe try __attribute__((__may_alias__))?
#CFLAGS += $(call cc-ifversion, -eq, 0404, -fno-strict-aliasing)
endif
# gcc 3.x emits bogus "old style proto" warning on find.c:alloc_action()
CFLAGS += $(call cc-ifversion, -ge, 0400, -Wold-style-definition)
ifneq ($(lastword $(subst -, ,$(CC))),clang)
# "clang-9: warning: optimization flag '-finline-limit=0' is not supported
CFLAGS += $(call cc-option,-finline-limit=0,)
endif
CFLAGS += $(call cc-option,-fno-builtin-strlen -fomit-frame-pointer -ffunction-sections -fdata-sections,)
# -fno-guess-branch-probability: prohibit pseudo-random guessing
# of branch probabilities (hopefully makes bloatcheck more stable):
CFLAGS += $(call cc-option,-fno-guess-branch-probability,)
CFLAGS += $(call cc-option,-funsigned-char,)
ifeq ($(CONFIG_STATIC_LIBGCC),y)
# Disable it, for example, if you get
# "clang-9: warning: argument unused during compilation: '-static-libgcc'"
CFLAGS += $(call cc-option,-static-libgcc,)
endif
CFLAGS += $(call cc-option,-falign-functions=1,)
ifneq ($(lastword $(subst -, ,$(CC))),clang)
# "clang-9: warning: optimization flag '-falign-jumps=1' is not supported" (and same for other two)
CFLAGS += $(call cc-option,-falign-jumps=1 -falign-labels=1 -falign-loops=1,)
endif
# Defeat .eh_frame bloat (gcc 4.6.3 x86-32 defconfig: 20% smaller busybox binary):
CFLAGS += $(call cc-option,-fno-unwind-tables,)
CFLAGS += $(call cc-option,-fno-asynchronous-unwind-tables,)
# No automatic printf->puts,putchar conversions
# (try disabling this and comparing assembly, it's instructive)
CFLAGS += $(call cc-option,-fno-builtin-printf,)
# clang-9 does not like "str" + N and "if (CONFIG_ITEM && cond)" constructs
ifeq ($(lastword $(subst -, ,$(CC))),clang)
CFLAGS += $(call cc-option,-Wno-string-plus-int -Wno-constant-logical-operand)
endif
# FIXME: These warnings are at least partially to be concerned about and should
# be fixed..
#CFLAGS += $(call cc-option,-Wconversion,)
ifneq ($(CONFIG_DEBUG),y)
CFLAGS += $(call cc-option,-Oz,$(call cc-option,-Os,$(call cc-option,-O2,)))
else
CFLAGS += $(call cc-option,-g,)
#CFLAGS += "-D_FORTIFY_SOURCE=2"
ifeq ($(CONFIG_DEBUG_PESSIMIZE),y)
CFLAGS += $(call cc-option,-O0,)
else
CFLAGS += $(call cc-option,-Oz,$(call cc-option,-Os,$(call cc-option,-O2,)))
endif
endif
ifeq ($(CONFIG_DEBUG_SANITIZE),y)
CFLAGS += $(call cc-option,-fsanitize=address,)
CFLAGS += $(call cc-option,-fsanitize=leak,)
CFLAGS += $(call cc-option,-fsanitize=undefined,)
endif
# If arch/$(ARCH)/Makefile did not override it (with, say, -fPIC)...
ARCH_FPIC ?= -fpic
ARCH_FPIE ?= -fpie
ARCH_PIE ?= -pie
# Usage: $(eval $(call pkg_check_modules,VARIABLE-PREFIX,MODULES))
define pkg_check_modules
$(1)_CFLAGS := $(shell $(PKG_CONFIG) $(PKG_CONFIG_FLAGS) --cflags $(2))
$(1)_LIBS := $(shell $(PKG_CONFIG) $(PKG_CONFIG_FLAGS) --libs $(2))
endef
ifeq ($(CONFIG_BUILD_LIBBUSYBOX),y)
# on i386: 14% smaller libbusybox.so
# (code itself is 9% bigger, we save on relocs/PLT/GOT)
CFLAGS += $(ARCH_FPIC)
# and another 4% reduction of libbusybox.so:
# (external entry points must be marked EXTERNALLY_VISIBLE)
CFLAGS += $(call cc-option,-fvisibility=hidden)
endif
ifeq ($(CONFIG_STATIC),y)
CFLAGS_busybox += -static
PKG_CONFIG_FLAGS += --static
endif
ifeq ($(CONFIG_PIE),y)
CFLAGS_busybox += $(ARCH_PIE)
CFLAGS += $(ARCH_FPIE)
endif
ifneq ($(CONFIG_EXTRA_CFLAGS),)
CFLAGS += $(strip $(subst ",,$(CONFIG_EXTRA_CFLAGS)))
#"))
endif
# Note: both "" (string consisting of two quote chars) and empty string
# are possible, and should be skipped below.
ifneq ($(subst "",,$(CONFIG_SYSROOT)),)
CFLAGS += --sysroot=$(CONFIG_SYSROOT)
export SYSROOT=$(CONFIG_SYSROOT)
endif
# libm may be needed for dc, awk, ntpd
LDLIBS += m
# Android has no separate crypt library
# gcc-4.2.1 fails if we try to feed C source on stdin:
# echo 'int main(void){return 0;}' | $(CC) $(CFLAGS) -lcrypt -o /dev/null -xc -
# fall back to using a temp file:
CRYPT_AVAILABLE := $(shell echo 'int main(void){return 0;}' >bb_libtest.c; $(CC) $(CFLAGS) $(CFLAGS_busybox) -lcrypt -o /dev/null bb_libtest.c >/dev/null 2>&1 && echo "y"; rm bb_libtest.c)
RT_AVAILABLE := $(shell echo 'int main(void){return 0;}' >bb_libtest.c; $(CC) $(CFLAGS) $(CFLAGS_busybox) -lrt -o /dev/null bb_libtest.c >/dev/null 2>&1 && echo "y"; rm bb_libtest.c)
ifeq ($(CRYPT_AVAILABLE),y)
LDLIBS += crypt
endif
# librt may be needed for clock_gettime()
ifeq ($(RT_AVAILABLE),y)
LDLIBS += rt
endif
# libpam may use libpthread, libdl and/or libaudit.
# On some platforms that requires an explicit -lpthread, -ldl, -laudit.
# However, on *other platforms* it fails when some of those flags
# are given needlessly. On some systems, crypt needs pthread.
#
# I even had a system where a runtime test for pthread
# (similar to CRYPT_AVAILABLE test above) was not reliable.
#
# Do not propagate this mess by adding libraries to CONFIG_PAM/CRYPT_AVAILABLE blocks.
# Add libraries you need to CONFIG_EXTRA_LDLIBS instead.
ifeq ($(CONFIG_PAM),y)
LDLIBS += pam pam_misc
endif
ifeq ($(CONFIG_SELINUX),y)
SELINUX_PC_MODULES = libselinux libsepol
$(eval $(call pkg_check_modules,SELINUX,$(SELINUX_PC_MODULES)))
CPPFLAGS += $(SELINUX_CFLAGS)
LDLIBS += $(if $(SELINUX_LIBS),$(SELINUX_LIBS:-l%=%),$(SELINUX_PC_MODULES:lib%=%))
endif
ifeq ($(CONFIG_FEATURE_NSLOOKUP_BIG),y)
ifneq (,$(findstring linux,$(shell $(CC) $(CFLAGS) -dumpmachine)))
LDLIBS += resolv
endif
ifneq (,$(findstring gnu,$(shell $(CC) $(CFLAGS) -dumpmachine)))
LDLIBS += resolv
endif
endif
ifeq ($(CONFIG_EFENCE),y)
LDLIBS += efence
endif
ifeq ($(CONFIG_DMALLOC),y)
LDLIBS += dmalloc
endif
# If a flat binary should be built, CFLAGS_busybox="-elf2flt"
# env var should be set for make invocation.
# Here we check whether CFLAGS_busybox indeed contains that flag.
# (For historical reasons, we also check LDFLAGS, which doesn't
# seem to be an entirely correct variable to put "-elf2flt" into).
W_ELF2FLT = -elf2flt
ifneq (,$(findstring $(W_ELF2FLT),$(LDFLAGS) $(CFLAGS_busybox)))
SKIP_STRIP = y
endif
ifneq ($(CONFIG_EXTRA_LDFLAGS),)
LDFLAGS += $(strip $(subst ",,$(CONFIG_EXTRA_LDFLAGS)))
#"))
endif
# BusyBox is stack-hungry, so make sure we increase the default stack size
# TODO: use "make stksizes" to find & fix big stack users
# (we stole scripts/checkstack.pl from the kernel... thanks guys!)
# Reduced from 20k to 16k in 1.9.0.
FLTFLAGS += -s 16000


@ -0,0 +1,44 @@
# ==========================================================================
# Build system
# ==========================================================================
help:
@echo 'Cleaning:'
@echo ' clean - delete temporary files created by build'
@echo ' distclean - delete all non-source files (including .config)'
@echo ' doc-clean - delete all generated documentation'
@echo
@echo 'Build:'
@echo ' all - Executable and documentation'
@echo ' busybox - the swiss-army executable'
@echo ' doc - docs/BusyBox.{txt,html,1}'
@echo ' html - create html-based cross-reference'
@echo
@echo 'Configuration:'
@echo ' allnoconfig - disable all symbols in .config'
@echo ' allyesconfig - enable all symbols in .config (see defconfig)'
@echo ' config - text based configurator (of last resort)'
@echo ' defconfig - set .config to largest generic configuration'
@echo ' menuconfig - interactive curses-based configurator'
@echo ' oldconfig - resolve any unresolved symbols in .config'
@$(if $(boards), \
$(foreach b, $(boards), \
printf " %-21s - Build for %s\\n" $(b) $(subst _defconfig,,$(b));) \
echo '')
@echo
@echo 'Installation:'
@echo ' install - install busybox into CONFIG_PREFIX'
@echo ' uninstall'
@echo
@echo 'Development:'
@echo ' baseline - create busybox_old for bloatcheck.'
@echo ' bloatcheck - show size difference between old and new versions'
@echo ' check - run the test suite for all applets'
@echo ' checkhelp - check for missing help-entries in Config.in'
@echo ' randconfig - generate a random configuration'
@echo ' release - create a distribution tarball'
@echo ' sizes - show size of all enabled busybox symbols'
@echo ' objsizes - show size of each .o object built'
@echo ' bigdata - show data objects, biggest first'
@echo ' stksizes - show stack users, biggest first'
@echo


@ -0,0 +1,437 @@
Why can't an applet be NOFORK or NOEXEC?
Why an applet can't be NOFORK:
interactive: may wait for user input, ^C has to work
spawner: "tool PROG ARGS" which changes program state and execs - must fork
changes state: e.g. environment, signal handlers
leaks: does not free allocated memory or opened fds
alloc+xfunc: xmalloc, then xfunc - leaks memory if xfunc dies
open+xfunc: opens fd, then calls xfunc - fd is leaked if xfunc dies
talks to network/serial/etc: it's not known how long the delay can be,
it's reasonable to expect it might be many seconds
(even if usually it is not), so ^C has to work
runner: sometimes may run for a long(ish) time, and/or works with network:
^C has to work (cat BIGFILE, chmod -R, ftpget, nc)
"runners" can become eligible once the shell is taught to let ^C interrupt
NOFORKs; they then need to be inspected to make sure they do not fall into
the alloc+xfunc, open+xfunc, or leaks categories.
Why an applet can't be NOEXEC:
suid: runs under different uid - must fork+exec
if it's important that /proc/PID/cmdline and comm are correct.
("pkill sh" killing itself before it kills real "sh" is no fun)
Why an applet shouldn't be NOFORK/NOEXEC:
rare: not started often enough to bother optimizing (example: poweroff)
daemon: runs indefinitely; these also always fit the "rare" category
longterm: often runs for a long time (many seconds), execing makes
memory footprint smaller
complex: no immediately obvious reason why NOFORK wouldn't work,
but does some non-obvious operations (example: fuser, lsof, losetup);
a detailed audit often reveals that it's a leaker
hardware: performs unusual hardware ops which may take long,
or even hang due to hardware or firmware bugs
An interesting example of an "interactive" applet which nevertheless can be
(and is) NOEXEC is "rm". Yes, "rm -i" is interactive - but it's not that typical
for users to keep it waiting for many minutes, whereas running "rm" in shell
is very typical, and speeding up this common use via NOEXEC is useful.
IOW: rm is "interactive", but not "longterm".
An interesting example of an applet which can be NOFORK but, if not,
then should not be NOEXEC, is "usleep". As NOFORK, it amounts to simply
nanosleep()ing in the calling program (usually shell). No memory wasted.
But if run as NOEXEC, it would create a potentially long-term process,
which would be taking more memory because it did not exec
and did not free much of the copied memory of the parent
(COW helps with this only as long as parent doesn't modify its memory).
[ - NOFORK
[[ - NOFORK
acpid - daemon
add-shell - noexec. leaks: open+xfunc
addgroup - noexec. leaks
adduser - noexec. leaks
adjtimex - NOFORK
ar - runner
arch - NOFORK
arp - talks to network: arp -n queries DNS
arping - longterm
ash - interactive, longterm
awk - noexec. runner
base64 - runner
basename - NOFORK
beep - longterm: beep -r 999999999
blkdiscard - noexec. leaks: open+xioctl
blkid - noexec
blockdev - noexec. leaks fd
bootchartd - daemon
brctl - noexec
bunzip2 - runner
bzcat - runner
bzip2 - runner
cal - noexec. can be runner: cal -n9999
cat - runner: cat HUGEFILE
chat - longterm (when used as intended - talking to modem over stdin/out)
chattr - noexec. runner
chgrp - noexec. runner
chmod - noexec. runner
chown - noexec. runner
chpasswd - longterm? (list of "user:password"s from stdin)
chpst - noexec. spawner
chroot - noexec. spawner
chrt - noexec. spawner
chvt - noexec. leaks: get_console_fd_or_die() may open a new fd, or return one of stdio fds
cksum - noexec. runner
clear - NOFORK
cmp - runner
comm - runner
conspy - interactive, longterm
cp - noexec. sometimes runner
cpio - runner
crond - daemon
crontab - longterm (runs $EDITOR), leaks: open+xasprintf
cryptpw - noexec. changes state: with --password-fd=N, moves N to stdin
cttyhack - noexec. spawner
cut - noexec. runner
date - noexec. nofork candidate(needs to stop messing up env, free xasprintf result, not use xfuncs after xasprintf)
dc - longterm (eats stdin if no params)
dd - noexec. runner
deallocvt - noexec. leaks: get_console_fd_or_die() may open a new fd, or return one of stdio fds
delgroup - noexec. leaks
deluser - noexec. leaks
depmod - longterm(ish)
devmem - hardware (access to device memory may hang)
df - noexec. leaks: nested allocs
dhcprelay - daemon
diff - runner
dirname - NOFORK
dmesg - runner
dnsd - daemon
dnsdomainname - noexec. talks to network (may query DNS)
dos2unix - noexec. runner
dpkg - runner
du - runner
dumpkmap - noexec. leaks: get_console_fd_or_die() may open a new fd, or return one of stdio fds
dumpleases - noexec. leaks: open+xread
echo - NOFORK
ed - interactive, longterm
egrep - longterm runner ("CMD | egrep ..." may run indefinitely, better to exec to conserve memory)
eject - hardware, leaks: open+ioctl_or_perror_and_die, changes state (moves fds)
env - noexec. spawner, changes state (env)
envdir - noexec. spawner
envuidgid - noexec. spawner
expand - runner
expr - noexec. leaks: nested allocs
factor - longterm (eats stdin if no params)
fakeidentd - daemon
false - NOFORK
fatattr - noexec. leaks: open+xioctl, complex
fbset - hardware, leaks: open+xfunc
fbsplash - runner, longterm
fdflush - hardware, leaks: open+ioctl_or_perror_and_die
fdformat - hardware, longterm
fdisk - interactive, longterm
fgconsole - noexec. leaks: get_console_fd_or_die() may open a new fd, or return one of stdio fds
fgrep - longterm runner ("CMD | fgrep ..." may run indefinitely, better to exec to conserve memory)
find - noexec. runner
findfs - suid
flash_eraseall - hardware
flash_lock - hardware
flash_unlock - hardware
flashcp - hardware
flock - spawner, changes state (file locks), let's play safe and not be noexec
fold - noexec. runner
free - NOFORK
freeramdisk - noexec. leaks: open+ioctl_or_perror_and_die
fsck - interactive, longterm
fsck.minix - needs ^C
fsfreeze - noexec. leaks: open+xioctl
fstrim - noexec. leaks: open+xioctl, find_block_device -> readdir+xstrdup
fsync - NOFORK
ftpd - daemon
ftpget - runner
ftpput - runner
fuser - complex
getopt - noexec. leaks: many allocs
getty - interactive, longterm
grep - longterm runner ("CMD | grep ..." may run indefinitely, better to exec to conserve memory)
groups - noexec
gunzip - runner
gzip - runner
halt - rare
hd - noexec. runner
hdparm - hardware
head - noexec. runner
hexdump - noexec. runner
hexedit - interactive, longterm
hostid - NOFORK
hostname - noexec. talks to network (hostname -d may query DNS)
httpd - daemon
hush - interactive, longterm
hwclock - hardware (xioctl(RTC_RD_TIME))
i2cdetect - hardware
i2cdump - hardware
i2cget - hardware
i2cset - hardware
id - noexec
ifconfig - hardware? (mem_start NN io_addr NN irq NN), leaks: xsocket+ioctl_or_perror_and_die
ifenslave - noexec. leaks: xsocket+bb_perror_msg_and_die
ifplugd - daemon
inetd - daemon
init - daemon
inotifyd - daemon
insmod - noexec
install - runner
ionice - noexec. spawner
iostat - longterm: "iostat 1" runs indefinitely
ip - noexec
ipaddr - noexec
ipcalc - noexec. ipcalc -h talks to network
ipcrm - noexec
ipcs - noexec
iplink - noexec
ipneigh - noexec
iproute - noexec
iprule - noexec
iptunnel - noexec
kbd_mode - noexec. leaks: xopen_nonblocking+xioctl
kill - NOFORK
killall - NOFORK
killall5 - NOFORK
klogd - daemon
last - runner (I've got 1300 lines of output when tried it)
less - interactive, longterm
link - NOFORK
linux32 - noexec. spawner
linux64 - noexec. spawner
linuxrc - daemon
ln - noexec
loadfont - noexec. leaks: config_open+bb_error_msg_and_die("map format")
loadkmap - noexec. leaks: get_console_fd_or_die() may open a new fd, or return one of stdio fds
logger - runner
login - suid, interactive, longterm
logname - NOFORK
losetup - noexec. complex
lpd - daemon
lpq - runner
lpr - runner
ls - noexec. runner
lsattr - noexec. runner
lsmod - noexec
lsof - complex
lspci - noexec. too rare to bother for nofork
lsscsi - noexec. too rare to bother for nofork
lsusb - noexec. too rare to bother for nofork
lzcat - runner
lzma - runner
lzop - runner
lzopcat - runner
makedevs - noexec
makemime - runner
man - spawner, interactive, longterm
md5sum - noexec. runner
mdev - daemon
mesg - NOFORK
microcom - interactive, longterm
minips - noexec
mkdir - NOFORK
mkdosfs - needs ^C
mke2fs - needs ^C
mkfifo - noexec
mkfs.ext2 - needs ^C
mkfs.minix - needs ^C
mkfs.vfat - needs ^C
mknod - noexec
mkpasswd - noexec. changes state: with --password-fd=N, moves N to stdin
mkswap - needs ^C
mktemp - noexec. leaks: xstrdup+concat_path_file
modinfo - noexec
modprobe - noexec
more - interactive, longterm
mount - suid
mountpoint - noexec. leaks: option -n "print dev name": find_block_device -> readdir+xstrdup
mpstat - longterm: "mpstat 1" runs indefinitely
mt - hardware
mv - noexec. sometimes runner
nameif - noexec. openlog(), leaks: config_open2+ioctl_or_perror_and_die
nbd-client - noexec
nc - runner
netstat - longterm with -c (continuous listing)
nice - noexec. spawner
nl - runner
nmeter - longterm
nohup - noexec. spawner
nproc - NOFORK
ntpd - daemon
nuke - noexec
od - runner
openvt - longterm: spawns a child and waits for it
partprobe - noexec. leaks: open+ioctl_or_perror_and_die(BLKRRPART)
passwd - suid
paste - noexec. runner
patch - needs ^C
pgrep - must fork+exec to get correct /proc/PID/cmdline and comm field
pidof - must fork+exec to get correct /proc/PID/cmdline and comm field
ping - suid, longterm
ping6 - suid, longterm
pipe_progress - longterm
pivot_root - NOFORK
pkill - must fork+exec to get correct /proc/PID/cmdline and comm field
pmap - noexec candidate, leaks: open+xstrdup
popmaildir - runner
poweroff - rare
powertop - interactive, longterm
printenv - NOFORK
printf - NOFORK
ps - noexec
pscan - talks to network
pstree - noexec
pwd - NOFORK
pwdx - NOFORK
raidautorun - noexec. very simple. leaks: open+xioctl
rdate - talks to network
rdev - noexec. leaks: find_block_device -> readdir+xstrdup
readlink - NOFORK
readprofile - reads /boot/System.map and /proc/profile, better to free more memory by execing?
realpath - NOFORK
reboot - rare
reformime - runner
remove-shell - noexec. leaks: open+xfunc
renice - noexec. nofork candidate(uses getpwnam, is that ok?)
reset - noexec. spawner (execs "stty")
resize - noexec. changes state (signal handlers)
resume - noexec
rev - runner
rm - noexec. rm -i interactive
rmdir - NOFORK
rmmod - noexec
route - talks to network (may query DNS to convert IPs to names)
rpm - runner
rpm2cpio - runner
rtcwake - longterm: puts system to sleep, optimizing this for speed is pointless
run-init - spawner, rare, changes state (oh yes), execing may be important to free binary's inode
run-parts - longterm
runlevel - noexec. can be nofork if "endutxent()" is called unconditionally, but too rare to bother?
runsv - daemon
runsvdir - daemon
rx - runner
script - longterm: pumps script output from slave pty
scriptreplay - longterm: plays back "script" saved output, sleeping as necessary.
sed - runner
sendmail - runner
seq - noexec. runner
setarch - noexec. spawner
setconsole - noexec
setfattr - noexec
setfont - noexec. leaks a lot of stuff
setkeycodes - noexec
setlogcons - noexec
setpriv - spawner, changes state, let's play safe and not be noexec
setserial - noexec
setsid - spawner, uses fork_or_rexec() [not audited to work in noexec], let's play safe and not be noexec
setuidgid - noexec. spawner
sha1sum - noexec. runner
sha256sum - noexec. runner
sha3sum - noexec. runner
sha512sum - noexec. runner
showkey - interactive, longterm
shred - runner
shuf - noexec. runner
slattach - longterm (may sleep forever), uses bb_common_bufsiz1
sleep - longterm. Could be nofork, if not the problem of "killall sleep" not killing it.
smemcap - runner
softlimit - noexec. spawner
sort - noexec. runner
split - runner
ssl_client - longterm
start-stop-daemon - not noexec: uses bb_common_bufsiz1
stat - noexec. nofork candidate(needs fewer allocs)
strings - runner
stty - noexec. nofork candidate: has no allocs or opens except xmove_fd(xopen("-F DEVICE"),STDIN). tcsetattr(STDIN) is not a problem: it would work the same across processes sharing this fd
su - suid, spawner
sulogin - noexec. spawner
sum - runner
sv - noexec. needs ^C (uses usleep(420000))
svc - noexec. needs ^C (uses usleep(420000))
svlogd - daemon
swapoff - longterm: may cause memory pressure, execing is beneficial
swapon - rare
switch_root - spawner, rare, changes state (oh yes), execing may be important to free binary's inode
sync - NOFORK
sysctl - noexec. leaks: xstrdup+xmalloc_read
syslogd - daemon
tac - noexec. runner
tail - runner
tar - runner
taskset - noexec. spawner
tcpsvd - daemon
tee - runner
telnet - interactive, longterm
telnetd - daemon
test - NOFORK
tftp - runner
tftpd - daemon
time - spawner, longterm, changes state (signals)
timeout - spawner, longterm, changes state (signals)
top - interactive, longterm
touch - NOFORK
tr - runner
traceroute - suid, longterm
traceroute6 - suid, longterm
true - NOFORK
truncate - NOFORK
tty - NOFORK
ttysize - NOFORK
tunctl - noexec
tune2fs - noexec. leaks: open+xfunc
ubiattach - hardware
ubidetach - hardware
ubimkvol - hardware
ubirename - hardware
ubirmvol - hardware
ubirsvol - hardware
ubiupdatevol - hardware
udhcpc - daemon
udhcpd - daemon
udpsvd - daemon
uevent - daemon
umount - noexec. leaks: nested xmalloc
uname - NOFORK
uncompress - runner
unexpand - runner
uniq - runner
unix2dos - noexec. runner
unlink - NOFORK
unlzma - runner
unlzop - runner
unxz - runner
unzip - runner
uptime - noexec. nofork candidate(is getutxent ok?)
users - noexec. nofork candidate(is getutxent ok?)
usleep - NOFORK. But what about "killall usleep"?
uudecode - runner
uuencode - runner
vconfig - noexec. leaks: xsocket+ioctl_or_perror_and_die
vi - interactive, longterm
vlock - suid
volname - hardware (reads CDROM, this can take long-ish if need to spin up)
w - noexec. nofork candidate(is getutxent ok?)
wall - suid
watch - longterm
watchdog - daemon
wc - runner
wget - longterm
which - NOFORK
who - noexec. nofork candidate(is getutxent ok?)
whoami - NOFORK
whois - talks to network
xargs - noexec. spawner
xxd - noexec. runner
xz - runner
xzcat - runner
yes - noexec. runner
zcat - runner
zcip - daemon

34
busybox-1_37_0/NOFORK_NOEXEC.sh Executable file

@ -0,0 +1,34 @@
#!/bin/sh
exec >NOFORK_NOEXEC.lst1
false && grep -Fv 'NOFORK' NOFORK_NOEXEC.lst \
| grep -v 'noexec.' | grep -v 'noexec$' \
| grep -v ' suid' \
| grep -v ' daemon' \
| grep -v ' longterm' \
| grep rare
echo === nofork candidate
grep -F 'nofork candidate' NOFORK_NOEXEC.lst
echo === noexec candidate
grep -F 'noexec candidate' NOFORK_NOEXEC.lst
echo === ^C
grep -F '^C' NOFORK_NOEXEC.lst \
| grep -F ' - '
echo === talks
grep -F 'talks' NOFORK_NOEXEC.lst \
| grep -F ' - '
echo ===
grep -Fv 'NOFORK' NOFORK_NOEXEC.lst \
| grep '^[^ ][^ ]* - ' \
| grep -v 'noexec.' | grep -v ' - noexec$' \
| grep -v ' suid' \
| grep -v ' daemon' \
| grep -v 'longterm' \
| grep -v 'interactive' \
| grep -v 'hardware'

204
busybox-1_37_0/README Normal file

@ -0,0 +1,204 @@
Please see the LICENSE file for details on copying and usage.
Please refer to the INSTALL file for instructions on how to build.
What is busybox:
BusyBox combines tiny versions of many common UNIX utilities into a single
small executable. It provides minimalist replacements for most of the
utilities you usually find in bzip2, coreutils, dhcp, diffutils, e2fsprogs,
file, findutils, gawk, grep, inetutils, less, modutils, net-tools, procps,
sed, shadow, sysklogd, sysvinit, tar, util-linux, and vim. The utilities
in BusyBox often have fewer options than their full-featured cousins;
however, the options that are included provide the expected functionality
and behave very much like their larger counterparts.
BusyBox has been written with size-optimization and limited resources in
mind, both to produce small binaries and to reduce run-time memory usage.
Busybox is also extremely modular so you can easily include or exclude
commands (or features) at compile time. This makes it easy to customize
embedded systems; to create a working system, just add /dev, /etc, and a
Linux kernel. Busybox (usually together with uClibc) has also been used as
a component of "thin client" desktop systems, live-CD distributions, rescue
disks, installers, and so on.
BusyBox provides a fairly complete POSIX environment for any small system,
both embedded environments and more full-featured systems concerned about
space. Busybox is slowly working towards implementing the full Single Unix
Specification V3 (http://www.opengroup.org/onlinepubs/009695399/), but isn't
there yet (and for size reasons will probably support at most UTF-8 for
internationalization). We are also interested in passing the Linux Test
Project (http://ltp.sourceforge.net).
----------------
Using busybox:
BusyBox is extremely configurable. This allows you to include only the
components and options you need, thereby reducing binary size. Run 'make
config' or 'make menuconfig' to select the functionality that you wish to
enable. (See 'make help' for more commands.)
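
As a rough illustration of the configure, build and install cycle, the sketch
below wraps three of the targets listed by 'make help'. The function name
bb_build and the MAKE override are assumptions made for this example only,
and passing CONFIG_PREFIX on the make command line is just one way to set it.

```shell
# Hypothetical wrapper (not part of BusyBox): configure, build, install.
# Must be run from the top of a BusyBox source tree.
# MAKE may be overridden, e.g. MAKE=echo for a dry run.
bb_build() {
    prefix=$1
    "${MAKE:-make}" defconfig &&
    "${MAKE:-make}" &&
    "${MAKE:-make}" CONFIG_PREFIX="$prefix" install
}
```

For example, bb_build "$HOME/bb-root" would install into that directory;
running 'make menuconfig' instead of 'make defconfig' lets you pick applets
interactively.
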
The behavior of busybox is determined by the name it's called under: as
"cp" it behaves like cp, as "sed" it behaves like sed, and so on. Called
as "busybox" it takes its first argument as the name of the applet to
run (e.g. "./busybox ls -l /proc").
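
The name-based dispatch can be imitated in a few lines of shell. This is only
a sketch of the idea, not BusyBox's actual C implementation; the function
name run_applet and its two toy applets are invented for the illustration.

```shell
# Toy multi-call dispatcher: $1 plays the role of argv[0].
run_applet() {
    name=${1##*/}            # strip any directory part, like basename
    shift
    if [ "$name" = busybox ]; then
        # Called as "busybox APPLET ARGS": the next word names the applet.
        [ $# -gt 0 ] || { echo "usage: busybox APPLET [ARGS]" >&2; return 1; }
        name=$1
        shift
    fi
    case $name in
        echo) printf '%s\n' "$*" ;;
        true) return 0 ;;
        *)    echo "$name: applet not found" >&2; return 127 ;;
    esac
}
```

Here run_applet /bin/echo hi and run_applet ./busybox echo hi take
different routes to the same applet, which is how a symlink named after an
applet and an explicit "busybox APPLET" call end up equivalent.
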
The "standalone shell" mode is an easy way to try out busybox; this is a
command shell that calls the built-in applets without needing them to be
installed in the path. (Note that this requires /proc to be mounted, if
testing from a boot floppy or in a chroot environment.)
The build automatically generates a file "busybox.links", which is used by
'make install' to create symlinks to the BusyBox binary for all compiled in
commands. This uses the CONFIG_PREFIX environment variable to specify
where to install, and installs hardlinks or symlinks depending
on the configuration preferences. (You can also manually run
the install script at "applets/install.sh").
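
To make the role of busybox.links concrete, here is a simplified sketch of the
symlink variant of installation. It is not the real applets/install.sh: the
helper name install_links is hypothetical, it assumes the links file holds one
absolute path per line (such as /bin/ls), and it points every link at
/bin/busybox rather than computing relative targets.

```shell
# Hypothetical helper: create applet symlinks under a staging prefix.
# $1 = prefix directory, $2 = links file (one absolute path per line).
install_links() {
    prefix=$1
    links_file=$2
    while IFS= read -r link; do
        # Skip blank lines and the binary's own path.
        case $link in
            ''|/bin/busybox) continue ;;
        esac
        mkdir -p "$prefix$(dirname "$link")" || return 1
        ln -sf /bin/busybox "$prefix$link" || return 1
    done < "$links_file"
}
```

With a links file listing /bin/ls and /usr/bin/wc, calling
install_links ./_install busybox.links would create ./_install/bin/ls and
./_install/usr/bin/wc. The links resolve once the prefix becomes the root
of the target system and the binary itself is installed at /bin/busybox.
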
----------------
Downloading the current source code:
Source for the latest released version, as well as daily snapshots, can always
be downloaded from
http://busybox.net/downloads/
You can browse the up to the minute source code and change history online.
http://git.busybox.net/busybox/
Anonymous GIT access is available. For instructions, check out:
http://www.busybox.net/source.html
For those that are actively contributing and would like to check files in,
see:
http://busybox.net/developer.html
The developers also have a bug and patch tracking system
(https://bugs.busybox.net) although posting a bug/patch to the mailing list
is generally a faster way of getting it fixed, and the complete archive of
what happened is the git changelog.
Note: if you want to compile busybox in a busybox environment you must
select CONFIG_DESKTOP.
----------------
Getting help:
When you find you need help, you can check out the busybox mailing list
archives at http://busybox.net/lists/busybox/ or even join
the mailing list if you are interested.
----------------
Bugs:
If you find bugs, please submit a detailed bug report to the busybox mailing
list at busybox@busybox.net. A well-written bug report should include a
transcript of a shell session that demonstrates the bad behavior and enables
anyone else to duplicate the bug on their own machine. The following is such
an example:
to: busybox@busybox.net
from: diligent@testing.linux.org
subject: /bin/date doesn't work
package: busybox
version: 1.00
when i execute busybox 'date' it produces unexpected results.
with gnu date i get the following output:
$ date
fri oct 8 14:19:41 mdt 2004
but when i use busybox date i get this instead:
$ date
illegal instruction
i am using debian unstable, kernel version 2.4.25-vrs2 on a netwinder,
and the latest uclibc from cvs.
-diligent
Note the careful description and use of examples showing not only what
busybox does, but also a counter example showing what an equivalent app
does (or pointing to the text of a relevant standard). Bug reports lacking
such detail may never be fixed... Thanks for understanding.
----------------
Portability:
Busybox is developed and tested on Linux 2.4 and 2.6 kernels, compiled
with gcc (the unit-at-a-time optimizations in version 3.4 and later are
worth upgrading to get, but older versions should work), and linked against
uClibc (0.9.27 or greater) or glibc (2.2 or greater). In such an
environment, the full set of busybox features should work, and if
anything doesn't we want to know about it so we can fix it.
There are many other environments out there, in which busybox may build
and run just fine. We just don't test them. Since busybox consists of a
large number of more or less independent applets, portability is a question
of which features work where. Some busybox applets (such as cat and rm) are
highly portable and likely to work just about anywhere, while others (such as
insmod and losetup) require recent Linux kernels with recent C libraries.
Earlier versions of Linux and glibc may or may not work, for any given
configuration. Linux 2.2 or earlier should mostly work (there's still
some support code in things like mount.c) but this is no longer regularly
tested, and inherently won't support certain features (such as long files
and --bind mounts). The same is true for glibc 2.0 and 2.1: expect a higher
testing and debugging burden using such old infrastructure. (The busybox
developers are not very interested in supporting these older versions, but
will probably accept small self-contained patches to fix simple problems.)
Some environments are not recommended. Early versions of uClibc were buggy
and missing many features: upgrade. Linking against libc5 or dietlibc is
not supported and not interesting to the busybox developers. (The first is
obsolete and has no known size or feature advantages over uClibc, the second
has known bugs that its developers have actively refused to fix.) Ancient
Linux kernels (2.0.x and earlier) are similarly uninteresting.
In theory it's possible to use Busybox under other operating systems (such as
MacOS X, Solaris, Cygwin, or the BSD Fork Du Jour). This generally involves
a different kernel and a different C library at the same time. While it
should be possible to port the majority of the code to work in one of
these environments, don't be surprised if it doesn't work out of the box. If
you're into that sort of thing, start small (selecting just a few applets)
and work your way up.
In 2005 Shaun Jackman ported busybox to a combination of newlib
and libgloss, and some of his patches have been integrated.
Supported hardware:
BusyBox in general will build on any architecture supported by gcc. We
support both 32 and 64 bit platforms, and both big and little endian
systems.
Under 2.4 Linux kernels, kernel module loading was implemented in a
platform-specific manner. Busybox's insmod utility has been reported to
work under ARM, CRIS, H8/300, x86, ia64, x86_64, m68k, MIPS, PowerPC, S390,
SH3/4/5, Sparc, and v850e. Anything else probably won't work.
The module loading mechanism for the 2.6 kernel is much more generic, and
we believe 2.6.x kernel module loading support should work on all
architectures supported by the kernel.
----------------
Please feed suggestions, bug reports, insults, and bribes back to the busybox
mailing list:
busybox@busybox.net
and/or maintainer:
Denys Vlasenko
<vda.linux@googlemail.com>

256
busybox-1_37_0/TODO Normal file

@ -0,0 +1,256 @@
Busybox TODO
Harvest patches from
http://git.openembedded.org/cgit.cgi/openembedded/tree/recipes/busybox/
https://dev.openwrt.org/browser/trunk/package/busybox/patches/
Stuff that needs to be done. This is organized by who plans to get around to
doing it eventually, but that doesn't mean they "own" the item. If you want to
do one of these, bounce an email off the person it's listed under to see if they
have any suggestions how they plan to go about it, and to minimize conflicts
between your work and theirs. But otherwise, all of these are fair game.
Rob Landley suggested this:
Implement bb_realpath() that can handle NULL on non-glibc.
sh
The command shell situation is a mess. We have two different
shells that don't really share any code, and the "standalone shell" doesn't
work all that well (especially not in a chroot environment), due to apps not
being reentrant.
Do a SUSv3 audit
Look at the full Single Unix Specification version 3 (available online at
"http://www.opengroup.org/onlinepubs/009695399/nfindex.html") and
figure out which of our apps are compliant, and what we're missing that
we might actually care about.
Even better would be some kind of automated compliance test harness that
exercises each command line option and the various corner cases.
Internationalization
How much internationalization should we do?
The low hanging fruit is UTF-8 character set support. We should do this.
See TODO_unicode file.
We also have lots of hardwired english text messages. Consolidating this
into some kind of message table not only makes translation easier, but
also allows us to consolidate redundant (or close) strings.
We probably don't want to be bloated with locale support. (Not unless we
can cleanly export it from our underlying C library without having to
concern ourselves with it directly. Perhaps a few specific things like a
config option for "date" are low hanging fruit here?)
What level should things happen at? How much do we care about
internationalizing the text console when X11 and xterms are so much better
at it? (There's some infrastructure here we don't implement: The
"unicode_start" and "unicode_stop" shell scripts need "vt-is-UTF8" and a
--unicode option to loadkeys. That implies a real loadkeys/dumpkeys
implementation to replace loadkmap/dumpkmap. Plus messing with console font
loading. Is it worth it, or do we just say "use X"?)
Individual compilation of applets.
It would be nice if busybox had the option to compile to individual applets,
for people who want an alternate implementation less bloated than the gnu
utils (or simply with less political baggage), but without it being one big
executable.
Turning libbb into a real dll is another possibility, especially if libbb
could export some of the other library interfaces we've already more or less
got the code for (like zlib).
buildroot - Make a "dogfood" option
Busybox 1.1 will be capable of replacing most gnu packages for real world
use, such as developing software or in a live CD. It needs wider testing.
Busybox should now be able to replace bzip2, coreutils, e2fsprogs, file,
findutils, gawk, grep, inetutils, less, modutils, net-tools, patch, procps,
sed, shadow, sysklogd, sysvinit, tar, util-linux, and vim. The resulting
system should be self-hosting (I.E. able to rebuild itself from source
code). This means it would need (at least) binutils, gcc, and make, or
equivalents.
It would be a good "eating our own dogfood" test if buildroot had the option
of using a "make allyesconfig" busybox instead of the all of the above
packages. Anything that's wrong with the resulting system, we can fix. (It
would be nice to be able to upgrade busybox to be able to replace bash and
diffutils as well, but we're not there yet.)
One example of an existing system that does this already is Firmware Linux:
http://www.landley.net/code/firmware
initramfs
Busybox should have a sample initramfs build script. This depends on
shell, mdev, and switch_root.
mkdep
Write a mkdep that doesn't segfault if there's a directory it doesn't
have permission to read, isn't based on manually editing the output of
lex and yacc, doesn't make such a mess under include/config, etc.
Group globals into unions of structures.
Go through and turn all the global and static variables into structures,
and have all those structures be in a big union shared between processes,
so busybox uses less bss. (This is a big win on nommu machines.) See
sed.c and mdev.c for examples.
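A minimal sketch of the idea, assuming two made-up applets; the struct layouts and the demo_ names are illustrative and are not taken from sed.c or mdev.c:

```c
#include <stddef.h>
#include <string.h>

/* Per-applet state for two hypothetical applets. */
struct demo_sed_globals {
	char pattern[64];
	int line_no;
};

struct demo_wc_globals {
	long lines;
	long words;
	long bytes;
};

/* Only one applet runs in a given process, so the state of all
 * applets can overlap in a single union instead of adding up in bss. */
static union {
	struct demo_sed_globals sed;
	struct demo_wc_globals wc;
} demo_globals;

#define SED_G (demo_globals.sed)
#define WC_G  (demo_globals.wc)

/* Example applet body that keeps its counters in the shared union. */
long demo_wc_count_lines(const char *text)
{
	memset(&demo_globals, 0, sizeof(demo_globals));
	for (; *text; text++) {
		WC_G.bytes++;
		if (*text == '\n')
			WC_G.lines++;
	}
	return WC_G.lines;
}

/* Size of the shared storage: that of the largest member, not the sum. */
size_t demo_globals_size(void)
{
	return sizeof(demo_globals);
}
```

With separate globals the two structures would occupy the sum of their sizes; in the union they occupy only the size of the larger one.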
Go through bugs.busybox.net and close out all of that somehow.
This one's open to everybody, but I'll wind up doing it...
Bernhard Reutner-Fischer <busybox@busybox.net> suggests to look at these:
New debug options:
-Wlarger-than-127
Cleanup any big users
Collate BUFSIZ IOBUF_SIZE MY_BUF_SIZE PIPE_PROGRESS_SIZE BUFSIZE PIPESIZE
make bb_common_bufsiz1 configurable, size wise.
make pipesize configurable, size wise.
Use bb_common_bufsiz1 throughout applets!
As yet unclaimed:
----
diff
Make sure we handle empty files properly:
From the patch man page:
you can remove a file by sending out a context diff that compares
the file to be deleted with an empty file dated the Epoch. The
file will be removed unless patch is conforming to POSIX and the
-E or --remove-empty-files option is not given.
---
patch
Should have simple fuzz factor support to apply patches at an offset which
shouldn't take up too much space.
And while we're at it, a new patch filename quoting format is apparently
coming soon: http://marc.theaimsgroup.com/?l=git&m=112927316408690&w=2
Architectural issues:
bb_close() with fsync()
We should have a bb_close() in place of normal close, with a CONFIG_ option
to not just check the return value of close() for an error, but fsync().
Close can't reliably report anything useful because if write() accepted the
data then it either went out to the network or it's in cache or a pipe
buffer. Either way, there's no guarantee it'll make it to its final
destination before close() gets called, so there's no guarantee that any
error will be reported.
You need to call fsync() if you care about errors that occur after write(),
but that can have a big performance impact. So make it a config option.
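One possible shape for such a wrapper is sketched below. The name demo_close and the DEMO_FSYNC_ON_CLOSE switch are placeholders for the proposed bb_close() and its CONFIG_ option, and ignoring EINVAL/EROFS from fsync() (descriptors that cannot be synced, such as pipes) is an assumption of this sketch rather than a decided policy.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical knob standing in for the proposed CONFIG_ option. */
#ifndef DEMO_FSYNC_ON_CLOSE
#define DEMO_FSYNC_ON_CLOSE 1
#endif

/* Close fd, optionally forcing its data to stable storage first.
 * Returns 0 on success, -1 with errno set to the first real error. */
int demo_close(int fd)
{
	int saved_errno = 0;

	if (DEMO_FSYNC_ON_CLOSE && fsync(fd) != 0) {
		/* EINVAL/EROFS: this descriptor cannot be synced at all
		 * (pipe, socket, ...); treated as harmless in this sketch. */
		if (errno != EINVAL && errno != EROFS)
			saved_errno = errno;
	}
	/* Always close, even if the sync failed, so the fd is not leaked. */
	if (close(fd) != 0 && saved_errno == 0)
		saved_errno = errno;
	if (saved_errno) {
		errno = saved_errno;
		return -1;
	}
	return 0;
}
```

Building with -DDEMO_FSYNC_ON_CLOSE=0 reduces the function to a checked close(), which is the cheap variant the text mentions.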
---
Unify archivers
Lots of archivers have the same general infrastructure. The directory
traversal code should be factored out, and the guts of each archiver could
be some setup code and a series of callbacks for "add this file",
"add this directory", "add this symlink" and so on.
This could clean up tar and zip, and make it cheaper to add cpio and ar
write support, and possibly even cheaply add things like mkisofs or
mksquashfs someday, if they become relevant.
---
Text buffer support.
Several existing applets (sort, vi, less...) read
a whole file into memory and act on it. Use open_read_close().
---
Memory Allocation
We have a CONFIG_BUFFER mechanism that lets us select whether to do memory
allocation on the stack or the heap. Unfortunately, we're not using it much.
We need to audit our memory allocations and turn a lot of malloc/free calls
into RESERVE_CONFIG_BUFFER/RELEASE_CONFIG_BUFFER.
For a start, see e.g. make EXTRA_CFLAGS=-Wlarger-than-64
And while we're at it, many of the CONFIG_FEATURE_CLEAN_UP #ifdefs will be
optimized out by the compiler in the stack allocation case (since there's no
free for an alloca()), and this means that various cleanup loops that just
call free might also be optimized out by the compiler if written right, so
we can yank those #ifdefs too, and generally clean up the code.
---
FEATURE_CLEAN_UP
This is more an unresolved issue than a to-do item. More thought is needed.
Normally we rely on exit() to free memory, close files and unmap segments
for us. This makes most calls to free(), close(), and munmap() optional in
busybox applets that don't intend to run for very long, and optional stuff
can be omitted to save size.
The idea was raised that we could simulate fork/exit with setjmp/longjmp
for _really_ brainless embedded systems, or speed up the standalone shell
by not forking. Doing so would require a reliable FEATURE_CLEAN_UP.
Unfortunately, this isn't as easy as it sounds.
The problem is, lots of things exit(), sometimes unexpectedly (xmalloc())
and sometimes reliably (bb_perror_msg_and_die() or show_usage()). This
jumps out of the normal flow control and bypasses any cleanup code we
put at the end of our applets.
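To make the problem concrete, here is a small sketch of the setjmp/longjmp idea. All demo_ names are hypothetical: demo_exit() stands in for exit(), and the "applet" bails out from the middle of its work, so any cleanup code after that point never runs.

```c
#include <setjmp.h>

static jmp_buf demo_exit_jmp;
static volatile int demo_exit_status;

/* Stand-in for exit(): unwinds back to demo_run_nofork()
 * instead of terminating the process. */
static void demo_exit(int status)
{
	demo_exit_status = status;
	longjmp(demo_exit_jmp, 1);
}

/* Hypothetical applet that may "die" in the middle of its work. */
static int demo_applet(int fail)
{
	if (fail)
		demo_exit(2);	/* skips everything below, cleanup included */
	/* ... cleanup code would live here ... */
	return 0;
}

/* Run the applet in-process; returns its exit status. */
int demo_run_nofork(int fail)
{
	if (setjmp(demo_exit_jmp) != 0)
		return demo_exit_status;	/* arrived here via demo_exit() */
	return demo_applet(fail);
}
```

Here demo_run_nofork(1) yields status 2 without the process exiting, but whatever the applet had allocated or opened before demo_exit() is still held, which is exactly why a reliable cleanup mechanism would be needed.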
It's possible to add hooks to libbb functions like xmalloc() and xopen()
to add their entries to a linked list, which could be traversed and
freed/closed automatically. (This would need to be able to free just the
entries after a checkpoint to be usable for a forkless standalone shell.
You don't want to free the shell's own resources.)
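The checkpoint idea could look roughly like this. It is a sketch under assumptions: the demo_ functions are invented for illustration, only memory is tracked (no file descriptors), and nothing like it exists in libbb as far as this text says.

```c
#include <stddef.h>
#include <stdlib.h>

/* One list entry per tracked allocation, newest first. */
struct demo_node {
	struct demo_node *next;
	void *ptr;
};

static struct demo_node *demo_head;
static unsigned demo_live;

/* malloc() wrapper that records the block; returns NULL on failure. */
void *demo_tracked_malloc(size_t size)
{
	struct demo_node *n = malloc(sizeof(*n));
	void *p = malloc(size);

	if (!n || !p) {
		free(n);
		free(p);
		return NULL;
	}
	n->ptr = p;
	n->next = demo_head;
	demo_head = n;
	demo_live++;
	return p;
}

/* A checkpoint is simply the list head at the time of the call. */
struct demo_node *demo_mark(void)
{
	return demo_head;
}

/* Free everything allocated after the checkpoint, leaving older
 * entries (e.g. the shell's own data) alone. NULL frees everything. */
void demo_release_to(struct demo_node *mark)
{
	while (demo_head && demo_head != mark) {
		struct demo_node *n = demo_head;

		demo_head = n->next;
		free(n->ptr);
		free(n);
		demo_live--;
	}
}

/* Number of allocations currently tracked. */
unsigned demo_live_count(void)
{
	return demo_live;
}
```

A forkless shell would call demo_mark() before running an applet in-process and demo_release_to() with that mark once the applet returns or "exits".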
Right now, FEATURE_CLEAN_UP is more or less a debugging aid, to make things
like valgrind happy. It's also documentation of _what_ we're trusting
exit() to clean up for us. But new infrastructure to auto-free stuff would
render the existing FEATURE_CLEAN_UP code redundant.
For right now, exit() handles it just fine.
Minor stuff:
watchdog.c could autodetect the timer duration via:
if(!ioctl (fd, WDIOC_GETTIMEOUT, &tmo)) timer_duration = 1 + (tmo / 2);
Unfortunately, that needs linux/watchdog.h and that contains unfiltered
kernel types on some distros, which breaks the build.
---
use bb_error_msg where appropriate: See
egrep "(printf.*\([[:space:]]*(stderr|2)|[^_]write.*\([[:space:]]*(stderr|2))"
---
use bb_perror_msg where appropriate: See
egrep "[^_]perror"
---
possible code duplication ingroup() and is_a_group_member()
---
Move __get_hz() to a better place and (re)use it in route.c, ash.c
---
See grep -r strtod
A lot of duplication that wants cleanup.
---
unify progress_meter. wget, flash_eraseall, pipe_progress, fbsplash, setfiles.
---
(TODO list after discussion 11.05.2009)
* shrink tc/brctl/ip
tc/brctl seem like fairly large things to try and tackle in your timeframe,
and i think people have posted attempts in the past. Adding additional
options to ip though seems reasonable.
* add tests for some applets
* implement POSIX utilities and audit them for POSIX conformance. then
audit them for GNU conformance. then document all your findings in a new
doc/conformance.txt file while perhaps implementing some of the missing
features.
you can find the latest POSIX documentation (1003.1-2008) here:
http://www.opengroup.org/onlinepubs/9699919799/
and the complete list of all utilities that POSIX covers:
http://www.opengroup.org/onlinepubs/9699919799/idx/utilities.html
The first step would be to generate a file/matrix of what is already achieved
(also IPv6)
* implement 'at'
* rpcbind (former portmap) or equivalent
so that we don't have to use -o nolock on nfs mounts
* check IPV6 compliance
* generate a mini example using kernel+busybox only (+libc) for example
* more support for advanced linux 2.6.x features, see: iotop
most likely there is more


@ -0,0 +1,45 @@
Already fixed applets:
cal
lsmod
df
dumpleases
Applets which may need unicode handling (more extensive than sanitizing
of filenames in error messages):
ls - work in progress
expand, unexpand - uses unicode_strlen, not scrlen
ash, hush through lineedit - uses unicode_strlen, not scrlen
top - need to sanitize process args
ps - need to sanitize process args
less
more
vi
ed
cut
awk
sed
tr
grep egrep fgrep
fold
sort
head, tail
catv - "display nonprinting chars" - what could this mean for unicode?
wc
chat
dumpkmap
last - just line up columns
man
microcom
strings
watch
Unsure, may need fixing:
hostname - do we really want to protect against bad chars in it?
patch
addgroup, adduser, delgroup, deluser
telnet
telnetd
od
printf

3
busybox-1_37_0/applets/.gitignore vendored Normal file

@ -0,0 +1,3 @@
/applet_tables
/usage
/usage_pod


@ -0,0 +1,57 @@
# Makefile for busybox
#
# Copyright (C) 1999-2005 by Erik Andersen <andersen@codepoet.org>
#
# Licensed under GPLv2, see file LICENSE in this source tree.
obj-y :=
obj-y += applets.o
hostprogs-y:=
hostprogs-y += usage usage_pod applet_tables
always:= $(hostprogs-y)
# Generated files need additional love
# This trick decreases amount of rebuilds
# if tree is merely renamed/copied
ifeq ($(srctree),$(objtree))
srctree_slash =
else
srctree_slash = $(srctree)/
endif
HOSTCFLAGS_usage.o = -I$(srctree_slash)include -Iinclude
HOSTCFLAGS_usage_pod.o = -I$(srctree_slash)include -Iinclude
applets/applets.o: include/usage_compressed.h include/applet_tables.h
applets/applet_tables: .config include/applets.h
applets/usage: .config include/applets.h
applets/usage_pod: .config include/applets.h include/applet_tables.h
quiet_cmd_gen_usage_compressed = GEN include/usage_compressed.h
cmd_gen_usage_compressed = $(srctree_slash)applets/usage_compressed include/usage_compressed.h applets
include/usage_compressed.h: applets/usage $(srctree_slash)applets/usage_compressed
$(call cmd,gen_usage_compressed)
quiet_cmd_gen_applet_tables = GEN include/applet_tables.h include/NUM_APPLETS.h
cmd_gen_applet_tables = applets/applet_tables include/applet_tables.h include/NUM_APPLETS.h
include/NUM_APPLETS.h: applets/applet_tables
$(call cmd,gen_applet_tables)
# In fact, include/applet_tables.h depends only on applets/applet_tables,
# and is generated by it. But specifying only it can run
# applets/applet_tables twice, possibly in parallel.
# We say that it also needs NUM_APPLETS.h
#
# Unfortunately, we need to list the same command,
# and it can be executed twice (sequentially).
# The alternative is to not list any command,
# and then if include/applet_tables.h is deleted, it won't be rebuilt.
#
include/applet_tables.h: include/NUM_APPLETS.h applets/applet_tables
$(call cmd,gen_applet_tables)


@ -0,0 +1,244 @@
/* vi: set sw=4 ts=4: */
/*
* Applet table generator.
* Runs on host and produces include/applet_tables.h
*
* Copyright (C) 2007 Denys Vlasenko <vda.linux@googlemail.com>
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <ctype.h>
#undef ARRAY_SIZE
#define ARRAY_SIZE(x) ((unsigned)(sizeof(x) / sizeof((x)[0])))
#ifndef PATH_MAX
#define PATH_MAX 1024
#endif
#include "../include/autoconf.h"
#include "../include/applet_metadata.h"
struct bb_applet {
const char *name;
const char *main;
enum bb_install_loc_t install_loc;
enum bb_suid_t need_suid;
/* true if instead of fork(); exec("applet"); waitpid();
* one can do fork(); exit(applet_main(argc,argv)); waitpid(); */
unsigned char noexec;
/* Even nicer */
/* true if instead of fork(); exec("applet"); waitpid();
* one can simply call applet_main(argc,argv); */
unsigned char nofork;
};
/* Define struct bb_applet applets[] */
#include "../include/applets.h"
enum { NUM_APPLETS = ARRAY_SIZE(applets) };
static int cmp_name(const void *a, const void *b)
{
const struct bb_applet *aa = a;
const struct bb_applet *bb = b;
return strcmp(aa->name, bb->name);
}
static int str_isalnum_(const char *s)
{
while (*s) {
if (!isalnum((unsigned char)*s) && *s != '_')
return 0;
s++;
}
return 1;
}
int main(int argc, char **argv)
{
int i, j;
char tmp1[PATH_MAX], tmp2[PATH_MAX];
// In find_applet_by_name(), before linear search, narrow it down
// by looking at N "equidistant" names. With ~350 applets:
// KNOWN_APPNAME_OFFSETS cycles
// 0 9057
// 2 4604 + ~100 bytes of code
// 4 2407 + 4 bytes
// 8 1342 + 8 bytes
// 16 908 + 16 bytes
// 32 884 + 32 bytes
// With 8, uint16_t applet_nameofs[] table has 7 elements.
int KNOWN_APPNAME_OFFSETS = 8;
// With 128 applets we do two linear searches, with 1..7 strcmp's in the first one
// and 1..16 strcmp's in the second. With 256 apps, second search does 1..32 strcmp's.
if (NUM_APPLETS < 128)
KNOWN_APPNAME_OFFSETS = 4;
if (NUM_APPLETS < 32)
KNOWN_APPNAME_OFFSETS = 0;
qsort(applets, NUM_APPLETS, sizeof(applets[0]), cmp_name);
for (i = j = 0; i < NUM_APPLETS-1; ++i) {
if (cmp_name(applets+i, applets+i+1) == 0) {
fprintf(stderr, "%s: duplicate applet name '%s'\n", argv[0],
applets[i].name);
j = 1;
}
}
if (j != 0 || !argv[1])
return 1;
snprintf(tmp1, PATH_MAX, "%s.%u.new", argv[1], (unsigned) getpid());
i = open(tmp1, O_WRONLY | O_TRUNC | O_CREAT, 0666);
if (i < 0)
return 1;
dup2(i, 1);
/* Keep in sync with include/busybox.h! */
printf("/* This is a generated file, don't edit */\n\n");
printf("#define NUM_APPLETS %u\n", NUM_APPLETS);
if (NUM_APPLETS == 1) {
printf("#define SINGLE_APPLET_STR \"%s\"\n", applets[0].name);
printf("#define SINGLE_APPLET_MAIN %s_main\n", applets[0].main);
}
printf("#define KNOWN_APPNAME_OFFSETS %u\n\n", KNOWN_APPNAME_OFFSETS);
if (KNOWN_APPNAME_OFFSETS > 0) {
int ofs, offset[KNOWN_APPNAME_OFFSETS], index[KNOWN_APPNAME_OFFSETS];
for (i = 0; i < KNOWN_APPNAME_OFFSETS; i++)
index[i] = i * NUM_APPLETS / KNOWN_APPNAME_OFFSETS;
ofs = 0;
for (i = 0; i < NUM_APPLETS; i++) {
for (j = 0; j < KNOWN_APPNAME_OFFSETS; j++)
if (i == index[j])
offset[j] = ofs;
ofs += strlen(applets[i].name) + 1;
}
/* If the list of names is too long refuse to proceed */
if (ofs > 0xffff)
return 1;
printf("const uint16_t applet_nameofs[] ALIGN2 = {\n");
for (i = 1; i < KNOWN_APPNAME_OFFSETS; i++)
printf("%d,\n", offset[i]);
printf("};\n\n");
}
//printf("#ifndef SKIP_definitions\n");
printf("const char applet_names[] ALIGN1 = \"\"\n");
for (i = 0; i < NUM_APPLETS; i++) {
printf("\"%s\" \"\\0\"\n", applets[i].name);
// if (MAX_APPLET_NAME_LEN < strlen(applets[i].name))
// MAX_APPLET_NAME_LEN = strlen(applets[i].name);
}
printf(";\n\n");
for (i = 0; i < NUM_APPLETS; i++) {
if (str_isalnum_(applets[i].name))
printf("#define APPLET_NO_%s %d\n", applets[i].name, i);
}
printf("\n");
printf("#ifndef SKIP_applet_main\n");
printf("int (*const applet_main[])(int argc, char **argv) = {\n");
for (i = 0; i < NUM_APPLETS; i++) {
printf("%s_main,\n", applets[i].main);
}
printf("};\n");
printf("#endif\n\n");
#if ENABLE_FEATURE_PREFER_APPLETS \
|| ENABLE_FEATURE_SH_STANDALONE \
|| ENABLE_FEATURE_SH_NOFORK
printf("const uint8_t applet_flags[] ALIGN1 = {\n");
i = 0;
while (i < NUM_APPLETS) {
int v = applets[i].nofork + (applets[i].noexec << 1);
if (++i < NUM_APPLETS)
v |= (applets[i].nofork + (applets[i].noexec << 1)) << 2;
if (++i < NUM_APPLETS)
v |= (applets[i].nofork + (applets[i].noexec << 1)) << 4;
if (++i < NUM_APPLETS)
v |= (applets[i].nofork + (applets[i].noexec << 1)) << 6;
printf("0x%02x,\n", v);
i++;
}
printf("};\n\n");
#endif
#if ENABLE_FEATURE_SUID
printf("const uint8_t applet_suid[] ALIGN1 = {\n");
i = 0;
while (i < NUM_APPLETS) {
int v = applets[i].need_suid; /* 2 bits */
if (++i < NUM_APPLETS)
v |= applets[i].need_suid << 2;
if (++i < NUM_APPLETS)
v |= applets[i].need_suid << 4;
if (++i < NUM_APPLETS)
v |= applets[i].need_suid << 6;
printf("0x%02x,\n", v);
i++;
}
printf("};\n\n");
#endif
#if ENABLE_FEATURE_INSTALLER
printf("const uint8_t applet_install_loc[] ALIGN1 = {\n");
i = 0;
while (i < NUM_APPLETS) {
int v = applets[i].install_loc; /* 3 bits */
if (++i < NUM_APPLETS)
v |= applets[i].install_loc << 4; /* 3 bits */
printf("0x%02x,\n", v);
i++;
}
printf("};\n");
#endif
//printf("#endif /* SKIP_definitions */\n");
// printf("\n");
// printf("#define MAX_APPLET_NAME_LEN %u\n", MAX_APPLET_NAME_LEN);
if (argv[2]) {
FILE *fp;
char line_new[80];
// char line_old[80];
sprintf(line_new, "#define NUM_APPLETS %u\n", NUM_APPLETS);
// line_old[0] = 0;
// fp = fopen(argv[2], "r");
// if (fp) {
// fgets(line_old, sizeof(line_old), fp);
// fclose(fp);
// }
// if (strcmp(line_old, line_new) != 0) {
snprintf(tmp2, PATH_MAX, "%s.%u.new", argv[2], (unsigned) getpid());
fp = fopen(tmp2, "w");
if (!fp)
return 1;
fputs(line_new, fp);
if (fclose(fp))
return 1;
// }
}
if (fclose(stdout))
return 1;
if (rename(tmp1, argv[1]))
return 1;
if (argv[2] && rename(tmp2, argv[2])) /* tmp2 is only set when argv[2] was given */
return 1;
return 0;
}
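The generator above packs the per-applet flag value nofork + (noexec << 1) four to a byte, lowest bits first. The sketch below shows that packing and a matching lookup in isolation; demo_pack2() and demo_unpack2() are hypothetical helpers written for this illustration, and how the consumer in libbb actually reads the table is not shown in this file.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n two-bit values (0..3) four per byte, lowest bits first.
 * out must have room for (n + 3) / 4 bytes. */
void demo_pack2(const uint8_t *vals, size_t n, uint8_t *out)
{
	size_t i;

	for (i = 0; i < (n + 3) / 4; i++)
		out[i] = 0;
	for (i = 0; i < n; i++)
		out[i / 4] |= (uint8_t)((vals[i] & 3) << (2 * (i % 4)));
}

/* Fetch the two-bit value for entry i from a packed table. */
uint8_t demo_unpack2(const uint8_t *packed, size_t i)
{
	return (uint8_t)((packed[i / 4] >> (2 * (i % 4))) & 3);
}
```

With this layout, bit 0 of an entry would be the nofork flag and bit 1 the noexec flag, and the table costs one byte per four applets.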


@ -0,0 +1,16 @@
/* vi: set sw=4 ts=4: */
/*
* Stub for linking busybox binary against libbusybox.
*
* Copyright (C) 2007 Denys Vlasenko <vda.linux@googlemail.com>
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include "busybox.h"
#if ENABLE_BUILD_LIBBUSYBOX
int main(int argc UNUSED_PARAM, char **argv)
{
return lbb_main(argv);
}
#endif


@ -0,0 +1,24 @@
#!/bin/sh
# Make busybox links list file.
# input $1: full path to Config.h
# input $2: full path to applets.h
# output (stdout): list of pathnames that should be linked to busybox
# Maintainer: Larry Doolittle <ldoolitt@recycle.lbl.gov>
export LC_ALL=POSIX
export LC_CTYPE=POSIX
CONFIG_H=${1:-include/autoconf.h}
APPLETS_H=${2:-include/applets.h}
$HOSTCC -E -DMAKE_LINKS -include $CONFIG_H $APPLETS_H |
awk '/^[ \t]*LINK/{
dir=substr($2,7)
gsub("_","/",dir)
if(dir=="/ROOT") dir=""
file=$3
gsub("\"","",file)
if (file=="busybox") next
print tolower(dir) "/" file
}'


@ -0,0 +1,16 @@
#!/bin/sh
# Make busybox scripted applet list file.
# input $1: full path to Config.h
# input $2: full path to applets.h
# output (stdout): list of scripted applet names
export LC_ALL=POSIX
export LC_CTYPE=POSIX
CONFIG_H=${1:-include/autoconf.h}
APPLETS_H=${2:-include/applets.h}
$HOSTCC -E -DMAKE_SCRIPTS -include $CONFIG_H $APPLETS_H |
awk '/^[ \t]*SCRIPT/{
print $2
}'


@ -0,0 +1,54 @@
#!/bin/sh
# Make list of configuration variables regarding suid handling
# input $1: full path to autoconf.h
# input $2: full path to applets.h
# input $3: full path to .config
# output (stdout): list of CONFIG_ that do or may require suid
# If the environment variable SUID is not set or set to DROP,
# lists all config options that do not require suid permissions.
# Otherwise, lists all config options for applets that DO or MAY require
# suid permissions.
# Maintainer: Bernhard Reutner-Fischer
export LC_ALL=POSIX
export LC_CTYPE=POSIX
CONFIG_H=${1:-include/autoconf.h}
APPLETS_H=${2:-include/applets.h}
DOT_CONFIG=${3:-.config}
case ${SUID:-DROP} in
[dD][rR][oO][pP]) USE="DROP" ;;
*) USE="suid" ;;
esac
$HOSTCC -E -DMAKE_SUID -include $CONFIG_H $APPLETS_H |
awk -v USE=${USE} '
/^SUID[ \t]/{
if (USE == "DROP") {
if ($2 != "BB_SUID_DROP") next
} else {
if ($2 == "BB_SUID_DROP") next
}
cfg = $NF
gsub("\"", "", cfg)
cfg = substr(cfg, 8)
s[i++] = "CONFIG_" cfg
s[i++] = "CONFIG_FEATURE_" cfg "_.*"
}
END{
while (getline < ARGV[2]) {
for (j in s) {
if ($0 ~ "^" s[j] "=y$") {
sub(/=.*/, "")
print
if (s[j] !~ /\*$/) delete s[j] # can drop this applet now
}
}
}
}
' - $DOT_CONFIG


@ -0,0 +1,24 @@
/* Minimal wrapper to build an individual busybox applet.
*
* Copyright 2005 Rob Landley <rob@landley.net>
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
const char *applet_name;
#include <stdio.h>
#include <stdlib.h>
#include "usage.h"
int main(int argc, char **argv)
{
applet_name = argv[0];
return APPLET_main(argc, argv);
}
void bb_show_usage(void)
{
fputs_stdout(APPLET_full_usage "\n");
exit_FAILURE();
}

137
busybox-1_37_0/applets/install.sh Executable file

@ -0,0 +1,137 @@
#!/bin/sh
export LC_ALL=POSIX
export LC_CTYPE=POSIX
prefix=$1
if [ -z "$prefix" ]; then
echo "usage: applets/install.sh DESTINATION TYPE [OPTS ...]"
echo " TYPE is one of: --symlinks --hardlinks --binaries --scriptwrapper --none"
echo " OPTS is one or more of: --cleanup --noclobber"
exit 1
fi
shift # Keep only remaining options
# Source the configuration
. ./.config
h=`sort busybox.links | uniq`
sharedlib_dir="0_lib"
linkopts=""
scriptwrapper="n"
binaries="n"
cleanup="0"
noclobber="0"
while [ ${#} -gt 0 ]; do
case "$1" in
--hardlinks) linkopts="-f";;
--symlinks) linkopts="-fs";;
--binaries) binaries="y";;
--scriptwrapper) scriptwrapper="y"; swrapall="y";;
--sw-sh-hard) scriptwrapper="y"; linkopts="-f";;
--sw-sh-sym) scriptwrapper="y"; linkopts="-fs";;
--cleanup) cleanup="1";;
--noclobber) noclobber="1";;
--none) h="";;
*) echo "Unknown install option: $1"; exit 1;;
esac
shift
done
if [ -n "$DO_INSTALL_LIBS" ] && [ x"$DO_INSTALL_LIBS" != x"n" ]; then
# get the target dir for the libs
# assume it starts with lib
libdir=$($CC -print-file-name=libc.so | \
sed -n 's%^.*\(/lib[^\/]*\)/libc.so%\1%p')
if test -z "$libdir"; then
libdir=/lib
fi
mkdir -p "$prefix/$libdir" || exit 1
for i in $DO_INSTALL_LIBS; do
rm -f "$prefix/$libdir/$i" || exit 1
if [ -f "$i" ]; then
echo " Installing $i to the target at $prefix/$libdir/"
cp -pPR "$i" "$prefix/$libdir/" || exit 1
chmod 0644 "$prefix/$libdir/`basename $i`" || exit 1
fi
done
fi
if [ x"$cleanup" = x"1" ] && [ -e "$prefix/bin/busybox" ]; then
inode=`ls -i "$prefix/bin/busybox" | awk '{print $1}'`
sub_shell_it=`
cd "$prefix"
for d in usr/sbin usr/bin sbin bin; do
pd=$PWD
if [ -d "$d" ]; then
cd "$d"
ls -iL . | grep "^ *$inode" | awk '{print $2}' | env -i xargs rm -f
fi
cd "$pd"
done
`
exit 0
fi
rm -f "$prefix/bin/busybox" || exit 1
mkdir -p "$prefix/bin" || exit 1
install -m 755 busybox "$prefix/bin/busybox" || exit 1
for i in $h; do
appdir=`dirname "$i"`
app=`basename "$i"`
if [ x"$noclobber" = x"1" ] && ([ -e "$prefix/$i" ] || [ -h "$prefix/$i" ]); then
echo " $prefix/$i already exists"
continue
fi
mkdir -p "$prefix/$appdir" || exit 1
if [ x"$scriptwrapper" = x"y" ]; then
if [ x"$swrapall" != x"y" ] && [ x"$i" = x"/bin/sh" ]; then
ln $linkopts busybox "$prefix/$i" || exit 1
else
rm -f "$prefix/$i"
echo "#!/bin/busybox" >"$prefix/$i"
chmod +x "$prefix/$i"
fi
echo " $prefix/$i"
elif [ x"$binaries" = x"y" ]; then
# Copy the binary over rather than linking to it
if [ -e "$sharedlib_dir/$app" ]; then
echo " Copying $sharedlib_dir/$app to $prefix/$i"
cp -pPR "$sharedlib_dir/$app" "$prefix/$i" || exit 1
else
echo "Error: Could not find $sharedlib_dir/$app"
exit 1
fi
else
if [ x"$linkopts" = x"-f" ]; then
bb_path="$prefix/bin/busybox"
else
case "$appdir" in
/)
bb_path="bin/busybox"
;;
/bin)
bb_path="busybox"
;;
/sbin)
bb_path="../bin/busybox"
;;
/usr/bin | /usr/sbin)
bb_path="../../bin/busybox"
;;
*)
echo "Unknown installation directory: $appdir"
exit 1
;;
esac
fi
echo " $prefix/$i -> $bb_path"
ln $linkopts "$bb_path" "$prefix/$i" || exit 1
fi
done
exit 0


@ -0,0 +1,55 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2008 Denys Vlasenko.
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include "autoconf.h"
/* Since we can't use platform.h, have to do this again by hand: */
#if ENABLE_NOMMU
# define BB_MMU 0
# define USE_FOR_NOMMU(...) __VA_ARGS__
# define USE_FOR_MMU(...)
#else
# define BB_MMU 1
# define USE_FOR_NOMMU(...)
# define USE_FOR_MMU(...) __VA_ARGS__
#endif
#include "usage.h"
#define MAKE_USAGE(aname, usage) { aname, usage },
static struct usage_data {
const char *aname;
const char *usage;
} usage_array[] = {
#include "applets.h"
};
static int compare_func(const void *a, const void *b)
{
const struct usage_data *ua = a;
const struct usage_data *ub = b;
return strcmp(ua->aname, ub->aname);
}
int main(void)
{
int i;
int num_messages = sizeof(usage_array) / sizeof(usage_array[0]);
if (num_messages == 0)
return 0;
qsort(usage_array,
num_messages, sizeof(usage_array[0]),
compare_func);
for (i = 0; i < num_messages; i++)
write(STDOUT_FILENO, usage_array[i].usage, strlen(usage_array[i].usage) + 1);
return 0;
}
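
Note on the output format: the tool above writes strlen() + 1 bytes per message, so the output is a run of NUL-terminated strings ordered by applet name. A consumer could locate the n-th message by skipping strings, as in the sketch below. The name nth_message is hypothetical, and this is not claimed to be the lookup code BusyBox itself uses.

```c
#include <string.h>

/* Hypothetical reader: return the n-th (0-based) string in a blob of
 * back-to-back NUL-terminated strings. The caller must make sure the
 * blob really contains at least n + 1 strings. */
const char *nth_message(const char *blob, int n)
{
	while (n-- > 0)
		blob += strlen(blob) + 1; /* skip one string and its NUL */
	return blob;
}
```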

@ -0,0 +1,62 @@
#!/bin/sh
target="$1"
loc="$2"
test "$target" || exit 1
test "$loc" || loc=.
test -x "$loc/usage" || exit 1
test "$SED" || SED=sed
test "$DD" || DD=dd
# Some people were bitten by their system lacking a (proper) od
od -v -b </dev/null >/dev/null
if test $? != 0; then
echo 'od tool is not installed or cannot accept "-v -b" options'
exit 1
fi
exec >"$target.$$"
echo '#define UNPACKED_USAGE "" \'
"$loc/usage" | od -v -b \
| grep -v '^ ' \
| $SED -e 's/^[^ ]*//' \
-e 's/ //g' \
-e '/^$/d' \
-e 's/\(...\)/\\\1/g' \
-e 's/^/"/' \
-e 's/$/" \\/'
echo ''
# "grep -v '^ '" is for toybox's od bug: od -b prints some extra lines:
#0000000 010 000 010 000 133 055 144 146 135 040 133 055 143 040 103 117
# 000010 000010 026533 063144 020135 026533 020143 047503
#0000020 116 106 104 111 122 135 040 133 055 154 040 114 117 107 106 111
# 043116 044504 056522 055440 066055 046040 043517 044506
#0000040 114 105 135 040 133 055 141 040 101 103 124 111 117 116 106 111
# 042514 020135 026533 020141 041501 044524 047117 044506
echo "#define UNPACKED_USAGE_LENGTH `$loc/usage | wc -c`"
echo
echo '#define PACKED_USAGE \'
## Breaks on big-endian systems!
## # Extra effort to avoid using "od -t x1": -t is not available
## # in non-CONFIG_DESKTOPed busybox od
##
## "$loc/usage" | bzip2 -1 | od -v -x \
## | $SED -e 's/^[^ ]*//' \
## -e 's/ //g' \
## -e '/^$/d' \
## -e 's/\(..\)\(..\)/0x\2,0x\1,/g'
## -e 's/$/ \\/'
"$loc/usage" | bzip2 -1 | $DD bs=2 skip=1 2>/dev/null | od -v -b \
| grep -v '^ ' \
| $SED -e 's/^[^ ]*//' \
-e 's/ //g' \
-e '/^$/d' \
-e 's/\(...\)/0\1,/g' \
-e 's/$/ \\/'
echo ''
mv -- "$target.$$" "$target"
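
Note on the UNPACKED_USAGE encoding: the first od/sed pipeline above turns every byte of the usage data into a three-digit octal escape inside a C string literal. The same transformation can be sketched in C as follows; octal_escape is a hypothetical helper written for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write each input byte as a three-digit octal
 * escape (\ooo), the form the od/sed pipeline produces.
 * Returns the number of characters written, or -1 if out is too small. */
int octal_escape(const unsigned char *in, size_t n, char *out, size_t outsz)
{
	size_t used = 0;
	size_t i;

	if (outsz == 0)
		return -1;
	for (i = 0; i < n; i++) {
		if (outsz - used < 5) /* 4 chars per byte plus the final NUL */
			return -1;
		snprintf(out + used, outsz - used, "\\%03o", (unsigned)in[i]);
		used += 4;
	}
	out[used] = '\0';
	return (int)used;
}
```

For the two bytes 'A' and newline this yields the text \101\012, which a C compiler reads back as the original bytes.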

@ -0,0 +1,113 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2009 Denys Vlasenko.
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include <unistd.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "autoconf.h"
#define SKIP_applet_main
#define ALIGN1 /* nothing, just to placate applet_tables.h */
#define ALIGN2 /* nothing, just to placate applet_tables.h */
#include "applet_tables.h"
/* Since we can't use platform.h, have to do this again by hand: */
#if ENABLE_NOMMU
# define BB_MMU 0
# define USE_FOR_NOMMU(...) __VA_ARGS__
# define USE_FOR_MMU(...)
#else
# define BB_MMU 1
# define USE_FOR_NOMMU(...)
# define USE_FOR_MMU(...) __VA_ARGS__
#endif
#include "usage.h"
#define MAKE_USAGE(aname, usage) { aname, usage },
static struct usage_data {
const char *aname;
const char *usage;
} usage_array[] = {
#include "applets.h"
};
static int compare_func(const void *a, const void *b)
{
const struct usage_data *ua = a;
const struct usage_data *ub = b;
return strcmp(ua->aname, ub->aname);
}
int main(void)
{
int col, len2;
int i;
int num_messages = sizeof(usage_array) / sizeof(usage_array[0]);
if (num_messages == 0)
return 0;
qsort(usage_array,
num_messages, sizeof(usage_array[0]),
compare_func);
col = 0;
for (i = 0; i < num_messages; i++) {
len2 = strlen(usage_array[i].aname) + 2;
if (col >= 76 - len2) {
printf(",\n");
col = 0;
}
if (col == 0) {
col = 6;
printf("\t");
} else {
printf(", ");
}
printf("%s", usage_array[i].aname);
col += len2;
}
printf("\n\n");
printf("=head1 COMMAND DESCRIPTIONS\n\n");
printf("=over 4\n\n");
for (i = 0; i < num_messages; i++) {
if (usage_array[i].aname[0] >= 'a' && usage_array[i].aname[0] <= 'z'
&& usage_array[i].usage[0] != NOUSAGE_STR[0]
) {
printf("=item B<%s>\n\n", usage_array[i].aname);
if (usage_array[i].usage[0])
printf("%s %s\n\n", usage_array[i].aname, usage_array[i].usage);
else
printf("%s\n\n", usage_array[i].aname);
}
}
printf("=back\n\n");
return 0;
}
/* TODO: we used to make options bold with B<> and output an example too:
=item B<cat>
cat [B<-u>] [FILE]...
Concatenate FILE(s) and print them to stdout
Options:
-u Use unbuffered i/o (ignored)
Example:
$ cat /proc/uptime
110716.72 17.67
*/
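
Note on the applet list wrapping: the first loop in main() above joins applet names with ", " and starts a new tab-indented line once the running column would reach 76. The sketch below reproduces that bookkeeping into a caller-supplied buffer so it can be checked in isolation. The name wrap_names and the width parameter are assumptions made for illustration; the original hardcodes 76 and prints directly.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join names with ", " and break to a new
 * tab-indented line when the next name would reach the width limit.
 * Returns the number of characters written, or -1 if out is too small. */
int wrap_names(const char *const *names, int count, int width,
		char *out, size_t outsz)
{
	size_t used = 0;
	int col = 0;
	int i;

	if (outsz == 0)
		return -1;
	out[0] = '\0';
	for (i = 0; i < count; i++) {
		int len2 = (int)strlen(names[i]) + 2; /* name plus ", " */
		const char *sep;
		int n;

		if (col >= width - len2) {
			sep = ",\n\t"; /* line is full: break, then indent */
			col = 6;
		} else if (col == 0) {
			sep = "\t";    /* very first name */
			col = 6;
		} else {
			sep = ", ";
		}
		n = snprintf(out + used, outsz - used, "%s%s", sep, names[i]);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1; /* output did not fit */
		used += (size_t)n;
		col += len2;
	}
	return (int)used;
}
```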

39
busybox-1_37_0/applets_sh/mim Executable file

@ -0,0 +1,39 @@
#!/bin/sh
MIMFILE="Mimfile"
if [ $# -ge 2 ] && [ "$1" = "-f" ]
then
MIMFILE="$2"
shift 2
fi
exec <"$MIMFILE" || exit 1
{
INCASE=false
while read -r REPLY
do
case $REPLY in
*:)
if ! $INCASE
then
printf '[ $# -eq 0 ] && set -- "%s"
TARGET="$1"
shift
case "$TARGET" in
' "${REPLY%:}"
else
printf ';;\n'
fi
printf '%s)\n' "${REPLY%:}"
INCASE=true
;;
"") ;;
*) printf '%s\n' "${REPLY##[ ]}";;
esac
done
$INCASE && printf ';;\n'
printf '*)
echo "Unknown command $TARGET"
exit 1
;;
esac
'
} | sh -s "$@"

@ -0,0 +1,3 @@
cat /etc/nologin.txt 2>/dev/null || echo This account is not available
sleep 5
exit 1

@ -0,0 +1,21 @@
# ==========================================================================
# Build system
# ==========================================================================
# Allow i486 insns (basically, bswap insn)
# Do not try to tune for 486+ (might add padding)
CFLAGS += $(call cc-option,-march=i486 -mtune=i386,)
ifeq ($(CONFIG_STACK_OPTIMIZATION_386),y)
# -mpreferred-stack-boundary=2 is essential in preventing gcc 4.2.x
# from aligning stack to 16 bytes. (Which is gcc's way of supporting SSE).
CFLAGS += $(call cc-option,-mpreferred-stack-boundary=2,)
endif
# "Control how GCC aligns variables.
# Supported values for type are compat uses increased alignment value
# compatible uses GCC 4.8 and earlier, abi uses alignment value as specified by the psABI,
# and cacheline uses increased alignment value to match the cache line size.
# compat is the default."
# "abi" seems to be somewhat successful in preventing overzealous data alignment.
CFLAGS += $(call cc-option,-malign-data=abi,)

@ -0,0 +1,11 @@
# When building a library, even intra-library references,
# such as from find_applet_by_name() to applet_names[],
# don't work with -fpic on sparc, needs -fPIC.
# Don't know why it fails in this case but works when
# a binary is being built.
#
# (if is superfluous, ARCH_FPIC is only used by library build, but it
# demonstrates the point: non-pic binary does not need it)
ifeq ($(CONFIG_BUILD_LIBBUSYBOX),y)
ARCH_FPIC = -fPIC
endif

@ -0,0 +1,11 @@
# When building a library, even intra-library references,
# such as from find_applet_by_name() to applet_names[],
# don't work with -fpic on sparc, needs -fPIC.
# Don't know why it fails in this case but works when
# a binary is being built.
#
# (if is superfluous, ARCH_FPIC is only used by library build, but it
# demonstrates the point: non-pic binary does not need it)
ifeq ($(CONFIG_BUILD_LIBBUSYBOX),y)
ARCH_FPIC = -fPIC
endif

@ -0,0 +1,11 @@
# ==========================================================================
# Build system
# ==========================================================================
# "Control how GCC aligns variables.
# Supported values for type are compat uses increased alignment value
# compatible uses GCC 4.8 and earlier, abi uses alignment value as specified by the psABI,
# and cacheline uses increased alignment value to match the cache line size.
# compat is the default."
# "abi" seems to be somewhat successful in preventing overzealous data alignment.
CFLAGS += $(call cc-option,-malign-data=abi,)

@ -0,0 +1,38 @@
#
# For a description of the syntax of this configuration file,
# see docs/Kconfig-language.txt.
#
menu "Archival Utilities"
config FEATURE_SEAMLESS_XZ
bool "Make tar, rpm, modprobe etc understand .xz data"
default y
config FEATURE_SEAMLESS_LZMA
bool "Make tar, rpm, modprobe etc understand .lzma data"
default y
config FEATURE_SEAMLESS_BZ2
bool "Make tar, rpm, modprobe etc understand .bz2 data"
default y
config FEATURE_SEAMLESS_GZ
bool "Make tar, rpm, modprobe etc understand .gz data"
default y
config FEATURE_SEAMLESS_Z
bool "Make tar, rpm, modprobe etc understand .Z data"
default n # it is ancient
INSERT
config FEATURE_LZMA_FAST
bool "Optimize lzma for speed"
default n
depends on UNLZMA || LZCAT || LZMA || FEATURE_SEAMLESS_LZMA
help
This option reduces decompression time by about 25% at the cost of
a 1K bigger binary.
endmenu

@ -0,0 +1,11 @@
# Makefile for busybox
#
# Copyright (C) 1999-2005 by Erik Andersen <andersen@codepoet.org>
#
# Licensed under GPLv2, see file LICENSE in this source tree.
libs-y += libarchive/
lib-y:=
INSERT

@ -0,0 +1,301 @@
/* vi: set sw=4 ts=4: */
/*
* Mini ar implementation for busybox
*
* Copyright (C) 2000 by Glenn McGrath
*
* Based in part on BusyBox tar, Debian dpkg-deb and GNU ar.
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*
* Archive creation support:
* Copyright (C) 2010 Nokia Corporation. All rights reserved.
* Written by Alexander Shishkin.
*
* There is no single standard to adhere to, so ar may not be portable
* between different systems
* http://www.unix-systems.org/single_unix_specification_v2/xcu/ar.html
*/
//config:config AR
//config: bool "ar (9.5 kb)"
//config: default n # needs to be improved to be able to replace binutils ar
//config: help
//config: ar is an archival utility program used to create, modify, and
//config: extract contents from archives. In practice, it is used exclusively
//config: for object module archives used by compilers.
//config:
//config: Unless you have a specific application which requires ar, you should
//config: probably say N here: most compilers come with their own ar utility.
//config:
//config:config FEATURE_AR_LONG_FILENAMES
//config: bool "Support long filenames (not needed for debs)"
//config: default y
//config: depends on AR
//config: help
//config: By default the ar format can only store the first 15 characters
//config: of the filename; this option removes that limitation.
//config: It supports the GNU ar long filename method, which moves multiple long
//config: filenames into the data section of a new ar entry.
//config:
//config:config FEATURE_AR_CREATE
//config: bool "Support archive creation"
//config: default y
//config: depends on AR
//config: help
//config: This enables archive creation (-c and -r) with busybox ar.
//applet:IF_AR(APPLET(ar, BB_DIR_USR_BIN, BB_SUID_DROP))
//kbuild:lib-$(CONFIG_AR) += ar.o
#include "libbb.h"
#include "bb_archive.h"
#include "ar_.h"
#if ENABLE_FEATURE_AR_CREATE
/* filter out entries with same names as specified on the command line */
static char FAST_FUNC filter_replaceable(archive_handle_t *handle)
{
if (find_list_entry(handle->accept, handle->file_header->name))
return EXIT_FAILURE;
return EXIT_SUCCESS;
}
static void output_ar_header(archive_handle_t *handle)
{
/* GNU ar 2.19.51.0.14 creates malformed archives
* if input files are >10G. It also truncates files >4GB
* (uses "size mod 4G"). We abort in this case:
* We could add support for up to 10G files, but this is unlikely to be useful.
* Note that unpacking side limits all fields to "unsigned int" data type,
* and treats "all ones" as an error indicator. Thus max we allow here is UINT_MAX-1.
*/
enum {
/* for 2nd field: mtime */
MAX11CHARS = UINT_MAX > 0xffffffff ? (unsigned)99999999999 : UINT_MAX-1,
/* for last field: filesize */
MAX10CHARS = UINT_MAX > 0xffffffff ? (unsigned)9999999999 : UINT_MAX-1,
};
struct file_header_t *fh = handle->file_header;
if (handle->offset & 1) {
xwrite(handle->src_fd, "\n", 1);
handle->offset++;
}
/* Careful! The widths should be exact. Fields must be separated */
if (sizeof(off_t) > 4 && fh->size > (off_t)MAX10CHARS) {
bb_error_msg_and_die("'%s' is bigger than ar can handle", fh->name);
}
fdprintf(handle->src_fd, "%-16.16s%-12lu%-6u%-6u%-8o%-10"OFF_FMT"u`\n",
fh->name,
(sizeof(time_t) > 4 && fh->mtime > MAX11CHARS) ? (long)0 : (long)fh->mtime,
fh->uid > 99999 ? 0 : (int)fh->uid,
fh->gid > 99999 ? 0 : (int)fh->gid,
(int)fh->mode & 07777777,
fh->size
);
handle->offset += AR_HEADER_LEN;
}
/*
* when replacing files in an existing archive, copy from the
* original archive those files that are to be left intact
*/
static void FAST_FUNC copy_data(archive_handle_t *handle)
{
archive_handle_t *out_handle = handle->ar__out;
struct file_header_t *fh = handle->file_header;
out_handle->file_header = fh;
output_ar_header(out_handle);
bb_copyfd_exact_size(handle->src_fd, out_handle->src_fd, fh->size);
out_handle->offset += fh->size;
}
static int write_ar_header(archive_handle_t *handle)
{
char *fn;
char fn_h[17]; /* 15 + "/" + NUL */
struct stat st;
int fd;
fn = llist_pop(&handle->accept);
if (!fn)
return -1;
xstat(fn, &st);
handle->file_header->mtime = st.st_mtime;
handle->file_header->uid = st.st_uid;
handle->file_header->gid = st.st_gid;
handle->file_header->mode = st.st_mode;
handle->file_header->size = st.st_size;
handle->file_header->name = fn_h;
//TODO: if ENABLE_FEATURE_AR_LONG_FILENAMES...
sprintf(fn_h, "%.15s/", bb_basename(fn));
output_ar_header(handle);
fd = xopen(fn, O_RDONLY);
bb_copyfd_exact_size(fd, handle->src_fd, st.st_size);
close(fd);
handle->offset += st.st_size;
return 0;
}
static int write_ar_archive(archive_handle_t *handle)
{
struct stat st;
archive_handle_t *out_handle;
xfstat(handle->src_fd, &st, handle->ar__name);
/* if archive exists, create a new handle for output.
* we create it in place of the old one.
*/
if (st.st_size != 0) {
out_handle = init_handle();
xunlink(handle->ar__name);
out_handle->src_fd = xopen(handle->ar__name, O_WRONLY | O_CREAT | O_TRUNC);
out_handle->accept = handle->accept;
} else {
out_handle = handle;
}
handle->ar__out = out_handle;
xwrite(out_handle->src_fd, AR_MAGIC "\n", AR_MAGIC_LEN + 1);
out_handle->offset += AR_MAGIC_LEN + 1;
/* skip to the end of the archive if we have to append stuff */
if (st.st_size != 0) {
handle->filter = filter_replaceable;
handle->action_data = copy_data;
unpack_ar_archive(handle);
}
while (write_ar_header(out_handle) == 0)
continue;
/* optional, since we exit right after we return */
if (ENABLE_FEATURE_CLEAN_UP) {
close(handle->src_fd);
if (out_handle->src_fd != handle->src_fd)
close(out_handle->src_fd);
}
return EXIT_SUCCESS;
}
#endif /* FEATURE_AR_CREATE */
static void FAST_FUNC header_verbose_list_ar(const file_header_t *file_header)
{
char mode[12];
char *mtime;
bb_mode_string(mode, file_header->mode);
mtime = ctime(&file_header->mtime);
mtime[16] = ' ';
memmove(&mtime[17], &mtime[20], 4);
mtime[21] = '\0';
printf("%s %u/%u%7"OFF_FMT"u %s %s\n", &mode[1],
(int)file_header->uid, (int)file_header->gid,
file_header->size,
&mtime[4], file_header->name
);
}
//usage:#define ar_trivial_usage
//usage: "x|p|t"IF_FEATURE_AR_CREATE("|r")" [-ov] ARCHIVE [FILE]..."
//usage:#define ar_full_usage "\n\n"
//usage: "Extract or list FILEs from an ar archive"IF_FEATURE_AR_CREATE(", or create it")"\n"
//usage: "\n x Extract"
//usage: "\n p Extract to stdout"
//usage: "\n t List"
//usage: IF_FEATURE_AR_CREATE(
//usage: "\n r Create"
//usage: )
//usage: "\n -o Restore mtime"
//usage: "\n -v Verbose"
int ar_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int ar_main(int argc UNUSED_PARAM, char **argv)
{
archive_handle_t *archive_handle;
unsigned opt, t;
enum {
OPT_VERBOSE = (1 << 0),
OPT_PRESERVE_DATE = (1 << 1),
/* "ar r" implies create, but warns about it. c suppresses warning.
* bbox accepts but ignores it: */
OPT_CREATE = (1 << 2),
CMD_PRINT = (1 << 3),
FIRST_CMD = CMD_PRINT,
CMD_LIST = (1 << 4),
CMD_EXTRACT = (1 << 5),
CMD_INSERT = ((1 << 6) * ENABLE_FEATURE_AR_CREATE),
};
archive_handle = init_handle();
/* prepend '-' to the first argument if required */
if (argv[1] && argv[1][0] != '-' && argv[1][0] != '\0')
argv[1] = xasprintf("-%s", argv[1]);
opt = getopt32(argv, "^"
"voc""ptx"IF_FEATURE_AR_CREATE("r")
"\0"
/* -1: at least one arg is reqd */
/* one of p,t,x[,r] is required */
"-1:p:t:x"IF_FEATURE_AR_CREATE(":r")
);
argv += optind;
t = opt / FIRST_CMD;
if (t & (t-1)) /* more than one of p,t,x[,r] are specified */
bb_show_usage();
if (opt & CMD_PRINT) {
archive_handle->action_data = data_extract_to_stdout;
}
if (opt & CMD_LIST) {
archive_handle->action_header = header_list;
}
if (opt & CMD_EXTRACT) {
archive_handle->action_data = data_extract_all;
}
if (opt & OPT_PRESERVE_DATE) {
archive_handle->ah_flags |= ARCHIVE_RESTORE_DATE;
}
if (opt & OPT_VERBOSE) {
archive_handle->action_header = header_verbose_list_ar;
}
#if ENABLE_FEATURE_AR_CREATE
archive_handle->ar__name = *argv;
#endif
archive_handle->src_fd = xopen(*argv++,
(opt & CMD_INSERT)
? O_RDWR | O_CREAT
: O_RDONLY
);
if (*argv)
archive_handle->filter = filter_accept_list;
while (*argv) {
llist_add_to_end(&archive_handle->accept, *argv++);
}
#if ENABLE_FEATURE_AR_CREATE
if (opt & CMD_INSERT)
return write_ar_archive(archive_handle);
#endif
unpack_ar_archive(archive_handle);
return EXIT_SUCCESS;
}
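
Note on the member header layout: the fdprintf() call in output_ar_header() emits fixed-width fields of 16, 12, 6, 6, 8 and 10 characters followed by the two-byte terminator "`\n", which adds up to 60 bytes. The sketch below formats one such header into a buffer so the widths can be verified. The name format_ar_header and the plain unsigned long parameters are assumptions for illustration; the original uses off_t with OFF_FMT and first rejects or zeroes values that are too wide for their field.

```c
#include <stdio.h>

/* Hypothetical helper: format one ar member header into buf, which must
 * hold at least 61 bytes. Widths are minimums, so the caller must keep
 * every value within its field, as the original code does. */
int format_ar_header(char *buf, size_t bufsz, const char *name,
		unsigned long mtime, unsigned uid, unsigned gid,
		unsigned mode, unsigned long size)
{
	return snprintf(buf, bufsz, "%-16.16s%-12lu%-6u%-6u%-8o%-10lu`\n",
			name, mtime, uid, gid, mode & 07777777u, size);
}
```

With in-range values the return value is 60, and the size field starts at offset 48.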

@ -0,0 +1,603 @@
/* vi: set sw=4 ts=4: */
/*
* Common code for gunzip-like applets
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
//kbuild:lib-$(CONFIG_ZCAT) += bbunzip.o
//kbuild:lib-$(CONFIG_GUNZIP) += bbunzip.o
//kbuild:lib-$(CONFIG_BZCAT) += bbunzip.o
//kbuild:lib-$(CONFIG_BUNZIP2) += bbunzip.o
/* lzop_main() uses bbunpack(), need this: */
//kbuild:lib-$(CONFIG_LZOP) += bbunzip.o
//kbuild:lib-$(CONFIG_LZOPCAT) += bbunzip.o
//kbuild:lib-$(CONFIG_UNLZOP) += bbunzip.o
/* bzip2_main() too: */
//kbuild:lib-$(CONFIG_BZIP2) += bbunzip.o
/* gzip_main() too: */
//kbuild:lib-$(CONFIG_GZIP) += bbunzip.o
#include "libbb.h"
#include "bb_archive.h"
static
int open_to_or_warn(int to_fd, const char *filename, int flags, int mode)
{
int fd = open3_or_warn(filename, flags, mode);
if (fd < 0) {
return 1;
}
xmove_fd(fd, to_fd);
return 0;
}
char* FAST_FUNC append_ext(char *filename, const char *expected_ext)
{
return xasprintf("%s.%s", filename, expected_ext);
}
int FAST_FUNC bbunpack(char **argv,
IF_DESKTOP(long long) int FAST_FUNC (*unpacker)(transformer_state_t *xstate),
char* FAST_FUNC (*make_new_name)(char *filename, const char *expected_ext),
const char *expected_ext
)
{
struct stat stat_buf;
IF_DESKTOP(long long) int status = 0;
char *filename, *new_name;
smallint exitcode = 0;
transformer_state_t xstate;
do {
/* NB: new_name is *maybe* malloc'ed! */
new_name = NULL;
filename = *argv; /* can be NULL - 'streaming' bunzip2 */
if (filename && LONE_DASH(filename))
filename = NULL;
/* Open src */
if (filename) {
if (!(option_mask32 & BBUNPK_SEAMLESS_MAGIC)) {
if (stat(filename, &stat_buf) != 0) {
err_name:
bb_simple_perror_msg(filename);
err:
exitcode = 1;
goto free_name;
}
if (open_to_or_warn(STDIN_FILENO, filename, O_RDONLY, 0))
goto err;
} else {
/* "clever zcat" with FILE */
/* fail_if_not_compressed because zcat refuses uncompressed input */
int fd = open_zipped(filename, /*fail_if_not_compressed:*/ 1);
if (fd < 0)
goto err_name;
xmove_fd(fd, STDIN_FILENO);
}
} else
if (option_mask32 & BBUNPK_SEAMLESS_MAGIC) {
/* "clever zcat" on stdin */
if (setup_unzip_on_fd(STDIN_FILENO, /*fail_if_not_compressed*/ 1))
goto err;
}
/* Special cases: test, stdout */
if (option_mask32 & (BBUNPK_OPT_STDOUT|BBUNPK_OPT_TEST)) {
if (option_mask32 & BBUNPK_OPT_TEST)
if (open_to_or_warn(STDOUT_FILENO, bb_dev_null, O_WRONLY, 0))
xfunc_die();
filename = NULL;
}
/* Open dst if we are going to unpack to file */
if (filename) {
new_name = make_new_name(filename, expected_ext);
if (!new_name) {
bb_error_msg("%s: unknown suffix - ignored", filename);
goto err;
}
/* -f: overwrite existing output files */
if (option_mask32 & BBUNPK_OPT_FORCE) {
unlink(new_name);
}
/* O_EXCL: "real" bunzip2 doesn't overwrite files */
/* GNU gunzip does not bail out, but goes to next file */
if (open_to_or_warn(STDOUT_FILENO, new_name, O_WRONLY | O_CREAT | O_EXCL,
stat_buf.st_mode))
goto err;
}
/* Check that the input is sane */
if (!(option_mask32 & BBUNPK_OPT_FORCE) && isatty(STDIN_FILENO)) {
bb_simple_error_msg_and_die("compressed data not read from terminal, "
"use -f to force it");
}
if (!(option_mask32 & BBUNPK_SEAMLESS_MAGIC)) {
init_transformer_state(&xstate);
/*xstate.signature_skipped = 0; - already is */
/*xstate.src_fd = STDIN_FILENO; - already is */
xstate.dst_fd = STDOUT_FILENO;
status = unpacker(&xstate);
if (status < 0)
exitcode = 1;
} else {
if (bb_copyfd_eof(STDIN_FILENO, STDOUT_FILENO) < 0)
/* Disk full, tty closed, etc. No point in continuing */
xfunc_die();
}
if (!(option_mask32 & BBUNPK_OPT_STDOUT))
xclose(STDOUT_FILENO); /* with error check! */
if (filename) {
char *del = new_name;
if (status >= 0) {
unsigned new_name_len;
/* TODO: restore other things? */
if (xstate.mtime != 0) {
struct timeval times[2];
times[1].tv_sec = times[0].tv_sec = xstate.mtime;
times[1].tv_usec = times[0].tv_usec = 0;
/* Note: we closed it first.
* On some systems calling utimes
* then closing resets the mtime
* back to current time. */
utimes(new_name, times); /* ignoring errors */
}
if (ENABLE_DESKTOP)
new_name_len = strlen(new_name);
/* Restore source filename (unless tgz -> tar case) */
if (new_name == filename) {
new_name_len = strlen(filename);
filename[new_name_len] = '.';
}
/* Extreme bloat for gunzip compat */
/* Some users do want this info... */
if (ENABLE_DESKTOP && (option_mask32 & BBUNPK_OPT_VERBOSE)) {
unsigned percent = status
? ((uoff_t)stat_buf.st_size * 100u / (unsigned long long)status)
: 0;
fprintf(stderr, "%s: %u%% - replaced with %.*s\n",
filename,
100u - percent,
new_name_len, new_name
);
}
/* Delete _source_ file */
del = filename;
if (option_mask32 & BBUNPK_OPT_KEEP) /* ... unless -k */
del = NULL;
}
if (del)
xunlink(del);
free_name:
if (new_name != filename)
free(new_name);
}
} while (*argv && *++argv);
if (option_mask32 & BBUNPK_OPT_STDOUT)
xclose(STDOUT_FILENO); /* with error check! */
return exitcode;
}
#if ENABLE_UNCOMPRESS \
|| ENABLE_FEATURE_BZIP2_DECOMPRESS \
|| ENABLE_UNLZMA || ENABLE_LZCAT || ENABLE_LZMA \
|| ENABLE_UNXZ || ENABLE_XZCAT || ENABLE_XZ
static
char* FAST_FUNC make_new_name_generic(char *filename, const char *expected_ext)
{
char *extension = strrchr(filename, '.');
if (!extension || strcmp(extension + 1, expected_ext) != 0) {
/* Mimic GNU gunzip - "real" bunzip2 tries to */
/* unpack file anyway, to file.out */
return NULL;
}
*extension = '\0';
return filename;
}
#endif
/*
* Uncompress applet for busybox (c) 2002 Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
//usage:#define uncompress_trivial_usage
//usage: "[-cf] [FILE]..."
//usage:#define uncompress_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -c Write to stdout"
//usage: "\n -f Overwrite"
//config:config UNCOMPRESS
//config: bool "uncompress (7.1 kb)"
//config: default n # ancient
//config: help
//config: uncompress is used to decompress archives created by compress.
//config: Not much used anymore, replaced by gzip/gunzip.
//applet:IF_UNCOMPRESS(APPLET(uncompress, BB_DIR_BIN, BB_SUID_DROP))
//kbuild:lib-$(CONFIG_UNCOMPRESS) += bbunzip.o
#if ENABLE_UNCOMPRESS
int uncompress_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int uncompress_main(int argc UNUSED_PARAM, char **argv)
{
// (N)compress 4.2.4.4:
// -d If given, decompression is done instead
// -c Write output on stdout, don't remove original
// -b Parameter limits the max number of bits/code
// -f Forces output file to be generated
// -v Write compression statistics
// -V Output version and compile options
// -r Recursive. If a filename is a directory, descend into it and compress everything
getopt32(argv, "cf");
argv += optind;
return bbunpack(argv, unpack_Z_stream, make_new_name_generic, "Z");
}
#endif
/*
* Gzip implementation for busybox
*
* Based on GNU gzip v1.2.4 Copyright (C) 1992-1993 Jean-loup Gailly.
*
* Originally adjusted for busybox by Sven Rudolph <sr1@inf.tu-dresden.de>
* based on gzip sources
*
* Adjusted further by Erik Andersen <andersen@codepoet.org> to support files as
* well as stdin/stdout, and to generally behave itself wrt command line
* handling.
*
* General cleanup to better adhere to the style guide and make use of standard
* busybox functions by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*
* gzip (GNU zip) -- compress files with zip algorithm and 'compress' interface
* Copyright (C) 1992-1993 Jean-loup Gailly
* The unzip code was written and put in the public domain by Mark Adler.
* Portions of the lzw code are derived from the public domain 'compress'
* written by Spencer Thomas, Joe Orost, James Woods, Jim McKie, Steve Davies,
* Ken Turkowski, Dave Mack and Peter Jannesen.
*/
//usage:#define gunzip_trivial_usage
//usage: "[-cfkt] [FILE]..."
//usage:#define gunzip_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define gunzip_example_usage
//usage: "$ ls -la /tmp/BusyBox*\n"
//usage: "-rw-rw-r-- 1 andersen andersen 557009 Apr 11 10:55 /tmp/BusyBox-0.43.tar.gz\n"
//usage: "$ gunzip /tmp/BusyBox-0.43.tar.gz\n"
//usage: "$ ls -la /tmp/BusyBox*\n"
//usage: "-rw-rw-r-- 1 andersen andersen 1761280 Apr 14 17:47 /tmp/BusyBox-0.43.tar\n"
//usage:
//usage:#define zcat_trivial_usage
//usage: "[FILE]..."
//usage:#define zcat_full_usage "\n\n"
//usage: "Decompress to stdout"
//config:config GUNZIP
//config: bool "gunzip (11 kb)"
//config: default y
//config: select FEATURE_GZIP_DECOMPRESS
//config: help
//config: gunzip is used to decompress archives created by gzip.
//config: You can use the '-t' option to test the integrity of
//config: an archive, without decompressing it.
//config:
//config:config ZCAT
//config: bool "zcat (24 kb)"
//config: default y
//config: select FEATURE_GZIP_DECOMPRESS
//config: help
//config: Alias to "gunzip -c".
//config:
//config:config FEATURE_GUNZIP_LONG_OPTIONS
//config: bool "Enable long options"
//config: default y
//config: depends on (GUNZIP || ZCAT) && LONG_OPTS
//applet:IF_GUNZIP(APPLET(gunzip, BB_DIR_BIN, BB_SUID_DROP))
// APPLET_ODDNAME:name main location suid_type help
//applet:IF_ZCAT(APPLET_ODDNAME(zcat, gunzip, BB_DIR_BIN, BB_SUID_DROP, zcat))
#if ENABLE_FEATURE_GZIP_DECOMPRESS
static
char* FAST_FUNC make_new_name_gunzip(char *filename, const char *expected_ext UNUSED_PARAM)
{
char *extension = strrchr(filename, '.');
if (!extension)
return NULL;
extension++;
/* "tgz" + 1 points at "gz", so this matches a plain .gz suffix */
if (strcmp(extension, "tgz" + 1) == 0
#if ENABLE_FEATURE_SEAMLESS_Z
|| (extension[0] == 'Z' && extension[1] == '\0')
#endif
) {
extension[-1] = '\0';
} else if (strcmp(extension, "tgz") == 0) {
filename = xstrdup(filename);
extension = strrchr(filename, '.');
extension[2] = 'a';
extension[3] = 'r';
} else {
return NULL;
}
return filename;
}
#if ENABLE_FEATURE_GUNZIP_LONG_OPTIONS
static const char gunzip_longopts[] ALIGN1 =
"stdout\0" No_argument "c"
"to-stdout\0" No_argument "c"
"force\0" No_argument "f"
"test\0" No_argument "t"
"no-name\0" No_argument "n"
;
#endif
/*
* Linux kernel build uses gzip -d -n. We accept and ignore it.
* Man page says:
* -n --no-name
* gzip: do not save the original file name and time stamp.
* (The original name is always saved if the name had to be truncated.)
* gunzip: do not restore the original file name/time even if present
* (remove only the gzip suffix from the compressed file name).
* This option is the default when decompressing.
* -N --name
* gzip: always save the original file name and time stamp (this is the default)
* gunzip: restore the original file name and time stamp if present.
*/
int gunzip_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int gunzip_main(int argc UNUSED_PARAM, char **argv)
{
#if ENABLE_FEATURE_GUNZIP_LONG_OPTIONS
getopt32long(argv, BBUNPK_OPTSTR "dtn", gunzip_longopts);
#else
getopt32(argv, BBUNPK_OPTSTR "dtn");
#endif
argv += optind;
/* If called as zcat...
* Normally, "zcat" is just "gunzip -c".
* But if seamless magic is enabled, then we are much more clever.
*/
if (ENABLE_ZCAT && applet_name[1] == 'c')
option_mask32 |= BBUNPK_OPT_STDOUT | BBUNPK_SEAMLESS_MAGIC;
return bbunpack(argv, unpack_gz_stream, make_new_name_gunzip, /*unused:*/ NULL);
}
#endif /* FEATURE_GZIP_DECOMPRESS */
/*
* Modified for busybox by Glenn McGrath
* Added support for output to stdout by Thomas Lundquist <thomasez@zelow.no>
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
//usage:#define bunzip2_trivial_usage
//usage: "[-cfk] [FILE]..."
//usage:#define bunzip2_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define bzcat_trivial_usage
//usage: "[FILE]..."
//usage:#define bzcat_full_usage "\n\n"
//usage: "Decompress to stdout"
//config:config BUNZIP2
//config: bool "bunzip2 (9.1 kb)"
//config: default y
//config: select FEATURE_BZIP2_DECOMPRESS
//config: help
//config: bunzip2 is a compression utility using the Burrows-Wheeler block
//config: sorting text compression algorithm, and Huffman coding. Compression
//config: is generally considerably better than that achieved by more
//config: conventional LZ77/LZ78-based compressors, and approaches the
//config: performance of the PPM family of statistical compressors.
//config:
//config: Unless you have a specific application which requires bunzip2, you
//config: should probably say N here.
//config:
//config:config BZCAT
//config: bool "bzcat (9 kb)"
//config: default y
//config: select FEATURE_BZIP2_DECOMPRESS
//config: help
//config: Alias to "bunzip2 -c".
//applet:IF_BUNZIP2(APPLET(bunzip2, BB_DIR_USR_BIN, BB_SUID_DROP))
// APPLET_ODDNAME:name main location suid_type help
//applet:IF_BZCAT(APPLET_ODDNAME(bzcat, bunzip2, BB_DIR_USR_BIN, BB_SUID_DROP, bzcat))
#if ENABLE_FEATURE_BZIP2_DECOMPRESS || ENABLE_BUNZIP2 || ENABLE_BZCAT
int bunzip2_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int bunzip2_main(int argc UNUSED_PARAM, char **argv)
{
getopt32(argv, BBUNPK_OPTSTR "dt");
argv += optind;
if (ENABLE_BZCAT && (!ENABLE_BUNZIP2 || applet_name[2] == 'c')) /* bzcat */
option_mask32 |= BBUNPK_OPT_STDOUT;
return bbunpack(argv, unpack_bz2_stream, make_new_name_generic, "bz2");
}
#endif
/*
* Small lzma deflate implementation.
* Copyright (C) 2006 Aurelien Jacobs <aurel@gnuage.org>
*
* Based on bunzip.c from busybox
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
//usage:#define unlzma_trivial_usage
//usage: "[-cfk] [FILE]..."
//usage:#define unlzma_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define lzma_trivial_usage
//usage: "-d [-cfk] [FILE]..."
//usage:#define lzma_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -d Decompress"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define lzcat_trivial_usage
//usage: "[FILE]..."
//usage:#define lzcat_full_usage "\n\n"
//usage: "Decompress to stdout"
//config:config UNLZMA
//config: bool "unlzma (7.8 kb)"
//config: default y
//config: help
//config: unlzma is a compression utility using the Lempel-Ziv-Markov chain
//config: compression algorithm, and range coding. Compression
//config: is generally considerably better than that achieved by the bzip2
//config: compressors.
//config:
//config:config LZCAT
//config: bool "lzcat (7.8 kb)"
//config: default y
//config: help
//config: Alias to "unlzma -c".
//config:
//config:config LZMA
//config: bool "lzma -d"
//config: default y
//config: help
//config: Enable this option if you want commands like "lzma -d" to work.
//config: IOW: you'll get lzma applet, but it will always require -d option.
//applet:IF_UNLZMA(APPLET(unlzma, BB_DIR_USR_BIN, BB_SUID_DROP))
// APPLET_ODDNAME:name main location suid_type help
//applet:IF_LZCAT(APPLET_ODDNAME(lzcat, unlzma, BB_DIR_USR_BIN, BB_SUID_DROP, lzcat))
//applet:IF_LZMA( APPLET_ODDNAME(lzma, unlzma, BB_DIR_USR_BIN, BB_SUID_DROP, lzma))
//kbuild:lib-$(CONFIG_UNLZMA) += bbunzip.o
//kbuild:lib-$(CONFIG_LZCAT) += bbunzip.o
//kbuild:lib-$(CONFIG_LZMA) += bbunzip.o
#if ENABLE_UNLZMA || ENABLE_LZCAT || ENABLE_LZMA
int unlzma_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int unlzma_main(int argc UNUSED_PARAM, char **argv)
{
IF_LZMA(int opts =) getopt32(argv, BBUNPK_OPTSTR "dt");
# if ENABLE_LZMA
/* lzma without -d or -t? */
if (applet_name[2] == 'm' && !(opts & (BBUNPK_OPT_DECOMPRESS|BBUNPK_OPT_TEST)))
bb_show_usage();
# endif
/* lzcat? */
if (ENABLE_LZCAT && applet_name[2] == 'c')
option_mask32 |= BBUNPK_OPT_STDOUT;
argv += optind;
return bbunpack(argv, unpack_lzma_stream, make_new_name_generic, "lzma");
}
#endif
//usage:#define unxz_trivial_usage
//usage: "[-cfk] [FILE]..."
//usage:#define unxz_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define xz_trivial_usage
//usage: "-d [-cfk] [FILE]..."
//usage:#define xz_full_usage "\n\n"
//usage: "Decompress FILEs (or stdin)\n"
//usage: "\n -d Decompress"
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: "\n -t Test integrity"
//usage:
//usage:#define xzcat_trivial_usage
//usage: "[FILE]..."
//usage:#define xzcat_full_usage "\n\n"
//usage: "Decompress to stdout"
//config:config UNXZ
//config: bool "unxz (13 kb)"
//config: default y
//config: help
//config: unxz is the successor to unlzma.
//config:
//config:config XZCAT
//config: bool "xzcat (13 kb)"
//config: default y
//config: help
//config: Alias to "unxz -c".
//config:
//config:config XZ
//config: bool "xz -d"
//config: default y
//config: help
//config: Enable this option if you want commands like "xz -d" to work.
//config: IOW: you'll get xz applet, but it will always require -d option.
//applet:IF_UNXZ(APPLET(unxz, BB_DIR_USR_BIN, BB_SUID_DROP))
// APPLET_ODDNAME:name main location suid_type help
//applet:IF_XZCAT(APPLET_ODDNAME(xzcat, unxz, BB_DIR_USR_BIN, BB_SUID_DROP, xzcat))
//applet:IF_XZ( APPLET_ODDNAME(xz, unxz, BB_DIR_USR_BIN, BB_SUID_DROP, xz))
//kbuild:lib-$(CONFIG_UNXZ) += bbunzip.o
//kbuild:lib-$(CONFIG_XZCAT) += bbunzip.o
//kbuild:lib-$(CONFIG_XZ) += bbunzip.o
#if ENABLE_UNXZ || ENABLE_XZCAT || ENABLE_XZ
int unxz_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int unxz_main(int argc UNUSED_PARAM, char **argv)
{
IF_XZ(int opts =) getopt32(argv, BBUNPK_OPTSTR "dt");
# if ENABLE_XZ
/* xz without -d or -t? */
if (applet_name[2] == '\0' && !(opts & (BBUNPK_OPT_DECOMPRESS|BBUNPK_OPT_TEST)))
bb_show_usage();
# endif
/* xzcat? */
if (ENABLE_XZCAT && applet_name[2] == 'c')
option_mask32 |= BBUNPK_OPT_STDOUT;
argv += optind;
return bbunpack(argv, unpack_xz_stream, make_new_name_generic, "xz");
}
#endif
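
The unlzma and unxz entry points above share one main function across three applet names and tell them apart by the third character of `applet_name`. The sketch below isolates that check for the xz family. It is an illustration only: `classify_xz_applet` and the `xz_mode` enum are hypothetical names, not BusyBox API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration; BusyBox itself tests
 * applet_name[2] inline inside unxz_main(). */
enum xz_mode {
	XZ_MODE_NEEDS_D,   /* "xz": usage error unless -d or -t is given */
	XZ_MODE_TO_STDOUT, /* "xzcat": behaves like "unxz -c" */
	XZ_MODE_UNPACK     /* "unxz": plain decompression */
};

enum xz_mode classify_xz_applet(const char *name)
{
	/* All three applet names are at least two characters long,
	 * so reading name[2] is safe (it may be the terminating NUL). */
	assert(name != NULL && strlen(name) >= 2);

	if (name[2] == '\0')   /* "xz" has only two characters */
		return XZ_MODE_NEEDS_D;
	if (name[2] == 'c')    /* "xzcat" */
		return XZ_MODE_TO_STDOUT;
	return XZ_MODE_UNPACK; /* "unxz": name[2] is 'x' */
}
```

A single character comparison is enough here because the three names differ at index 2, which avoids a full string comparison per invocation.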


@ -0,0 +1,61 @@
#!/bin/sh
# Test that concatenated gz files are unpacked correctly.
# It also tests that unpacking in general is working right.
# Since zip code has many corner cases, run it for a few hours
# to get a decent coverage (200000 tests or more).
gzip="gzip"
gunzip="../busybox gunzip"
# Or the other way around:
#gzip="../busybox gzip"
#gunzip="gunzip"
c=0
i=$$ # seed from the shell's PID ($PID is not set by POSIX shells)
while true; do
c=$((c+1))
# RANDOM is not very random on some shells. Spice it up.
# 100003 is prime
len1=$(( (((RANDOM*RANDOM)^i) & 0x7ffffff) % 100003 ))
i=$((i * 1664525 + 1013904223))
len2=$(( (((RANDOM*RANDOM)^i) & 0x7ffffff) % 100003 ))
# Just using urandom will make gzip use method 0 (store) -
# not good for test coverage!
cat /dev/urandom | while true; do read junk; echo "junk $c $i $junk"; done \
| dd bs=$len1 count=1 >z1 2>/dev/null
cat /dev/urandom | while true; do read junk; echo "junk $c $i $junk"; done \
| dd bs=$len2 count=1 >z2 2>/dev/null
$gzip <z1 >zz.gz
$gzip <z2 >>zz.gz
$gunzip -c zz.gz >z9 || {
echo "Exitcode $?"
exit 1
}
sum=`cat z1 z2 | md5sum`
sum9=`md5sum <z9`
test "$sum" = "$sum9" || {
echo "md5sums don't match"
exit 1
}
echo "Test $c ok: len1=$len1 len2=$len2 sum=$sum"
sum=`cat z1 z2 z1 z2 | md5sum`
rm z1.gz z2.gz 2>/dev/null
$gzip z1
$gzip z2
cat z1.gz z2.gz z1.gz z2.gz >zz.gz
$gunzip -c zz.gz >z9 || {
echo "Exitcode $? (2)"
exit 1
}
sum9=`md5sum <z9`
test "$sum" = "$sum9" || {
echo "md5sums don't match (1)"
exit 1
}
echo "Test $c ok: len1=$len1 len2=$len2 sum=$sum (2)"
done
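
The script above picks its block lengths by mixing two `$RANDOM` reads with a linear congruential generator (multiplier 1664525, increment 1013904223) and reducing the result modulo the prime 100003. A minimal C sketch of the same arithmetic follows. The function names `lcg_next` and `pick_len` are hypothetical, and the state is assumed to be 32 bits wide, whereas a shell normally computes in a wider type; because of the `& 0x7ffffff` mask only the low 27 bits matter, so the narrower state should yield the same lengths as long as the shell's arithmetic wraps around on overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one step of the LCG used by the script.
 * Unsigned arithmetic wraps modulo 2^32. */
uint32_t lcg_next(uint32_t state)
{
	return state * 1664525u + 1013904223u;
}

/* Hypothetical helper: r1 and r2 stand in for two $RANDOM reads,
 * each in the range 0..32767. The result is always below 100003. */
uint32_t pick_len(uint32_t r1, uint32_t r2, uint32_t state)
{
	assert(r1 <= 32767 && r2 <= 32767);
	/* r1 * r2 is below 2^30, so the product cannot overflow. */
	return (((r1 * r2) ^ state) & 0x7ffffff) % 100003;
}
```

In the script, `len1` is derived from the current state, the state is then advanced once, and `len2` is derived from the new state, so the two lengths differ even when `$RANDOM` repeats.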


@ -0,0 +1,10 @@
#!/bin/sh
# Leak test for gunzip. Watch top for growing process size.
# Just using urandom will make gzip use method 0 (store) -
# not good for test coverage!
cat /dev/urandom \
| while true; do read junk; echo "junk $RANDOM $junk"; done \
| ../busybox gzip \
| ../busybox gunzip -c >/dev/null


@ -0,0 +1,23 @@
#!/bin/sh
# Leak test for gunzip. Watch top for growing process size.
# In this case we look for leaks in "concatenated .gz" code -
# we feed gunzip with a stream of .gz files.
i=$$ # seed from the shell's PID ($PID is not set by POSIX shells)
c=0
while true; do
c=$((c + 1))
echo "Block# $c" >&2
# RANDOM is not very random on some shells. Spice it up.
i=$((i * 1664525 + 1013904223))
# 100003 is prime
len=$(( (((RANDOM*RANDOM)^i) & 0x7ffffff) % 100003 ))
# Just using urandom will make gzip use method 0 (store) -
# not good for test coverage!
cat /dev/urandom \
| while true; do read junk; echo "junk $c $i $junk"; done \
| dd bs=$len count=1 2>/dev/null \
| gzip >xxx.gz
cat xxx.gz xxx.gz xxx.gz xxx.gz xxx.gz xxx.gz xxx.gz xxx.gz
done | ../busybox gunzip -c >/dev/null


@ -0,0 +1,247 @@
/*
* Copyright (C) 2007 Denys Vlasenko <vda.linux@googlemail.com>
*
* This file uses bzip2 library code which is written
* by Julian Seward <jseward@bzip.org>.
* See README and LICENSE files in bz/ directory for more information
* about bzip2 library code.
*/
//config:config BZIP2
//config: bool "bzip2 (16 kb)"
//config: default y
//config: help
//config: bzip2 is a compression utility using the Burrows-Wheeler block
//config: sorting text compression algorithm, and Huffman coding. Compression
//config: is generally considerably better than that achieved by more
//config: conventional LZ77/LZ78-based compressors, and approaches the
//config: performance of the PPM family of statistical compressors.
//config:
//config: Unless you have a specific application which requires bzip2, you
//config: should probably say N here.
//config:
//config:config BZIP2_SMALL
//config: int "Trade bytes for speed (0:fast, 9:small)"
//config: default 8 # all "fast or small" options default to small
//config: range 0 9
//config: depends on BZIP2
//config: help
//config: Trade code size versus speed.
//config: Approximate values with gcc-6.3.0 "bzip2 -9" compressing
//config: linux-4.15.tar were:
//config: value          time (sec)  code size (386)
//config: 9 (smallest)        70.11             7687
//config: 8                   67.93             8091
//config: 7                   67.88             8405
//config: 6                   67.78             8624
//config: 5                   67.05             9427
//config: 4-0 (fastest)       64.14            12083
//config:
//config:config FEATURE_BZIP2_DECOMPRESS
//config: bool "Enable decompression"
//config: default y
//config: depends on BZIP2 || BUNZIP2 || BZCAT
//config: help
//config: Enable -d (--decompress) and -t (--test) options for bzip2.
//config: This will be automatically selected if bunzip2 or bzcat is
//config: enabled.
//applet:IF_BZIP2(APPLET(bzip2, BB_DIR_USR_BIN, BB_SUID_DROP))
//kbuild:lib-$(CONFIG_BZIP2) += bzip2.o
//usage:#define bzip2_trivial_usage
//usage: "[-cfk" IF_FEATURE_BZIP2_DECOMPRESS("dt") "123456789] [FILE]..."
//usage:#define bzip2_full_usage "\n\n"
//usage: "Compress FILEs (or stdin) with bzip2 algorithm\n"
//usage: "\n -1..9 Compression level"
//usage: IF_FEATURE_BZIP2_DECOMPRESS(
//usage: "\n -d Decompress"
//usage: )
//usage: "\n -c Write to stdout"
//usage: "\n -f Force"
//usage: "\n -k Keep input files"
//usage: IF_FEATURE_BZIP2_DECOMPRESS(
//usage: "\n -t Test integrity"
//usage: )
#include "libbb.h"
#include "bb_archive.h"
#if CONFIG_BZIP2_SMALL >= 4
#define BZIP2_SPEED (9 - CONFIG_BZIP2_SMALL)
#else
#define BZIP2_SPEED 5
#endif
/* Speed test:
* Compiled with gcc 4.2.1, run on Athlon 64 1800 MHz (512K L2 cache).
* Stock bzip2 is 26.4% slower than bbox bzip2 at SPEED 1
* (time to compress gcc-4.2.1.tar is 126.4% compared to bbox).
* At SPEED 5 difference is 32.7%.
*
* Test run of all BZIP2_SPEED values on an 11 MB text file:
*     Size   Time (3 runs)
* 0:  10828  4.145 4.146 4.148
* 1:  11097  3.845 3.860 3.861
* 2:  11392  3.763 3.767 3.768
* 3:  11892  3.722 3.724 3.727
* 4:  12740  3.637 3.640 3.644
* 5:  17273  3.497 3.509 3.509
*/
#define BZ_DEBUG 0
/* Takes ~300 bytes, detects corruption caused by bad RAM etc */
#define BZ_LIGHT_DEBUG 0
#include "libarchive/bz/bzlib.h"
#include "libarchive/bz/bzlib_private.h"
#include "libarchive/bz/blocksort.c"
#include "libarchive/bz/bzlib.c"
#include "libarchive/bz/compress.c"
#include "libarchive/bz/huffman.c"
/* No point in being shy and having very small buffer here.
* bzip2 internal buffers are much bigger anyway, hundreds of kbytes.
* If iobuf is several pages long, malloc() may use mmap,
* making iobuf page aligned and thus (maybe) have one memcpy less
* if kernel is clever enough.
*/
enum {
IOBUF_SIZE = 8 * 1024
};
/* NB: compressStream() has to return -1 on errors, not die.
* bbunpack() will correctly clean up in this case
* (delete incomplete .bz2 file)
*/
/* Returns:
* -1 on errors
* total written bytes so far otherwise
*/
static
IF_DESKTOP(long long) int bz_write(bz_stream *strm, void* rbuf, ssize_t rlen, void *wbuf)
{
int n, n2, ret;
strm->avail_in = rlen;
strm->next_in = rbuf;
while (1) {
strm->avail_out = IOBUF_SIZE;
strm->next_out = wbuf;
ret = BZ2_bzCompress(strm, rlen ? BZ_RUN : BZ_FINISH);
if (ret != BZ_RUN_OK /* BZ_RUNning */
&& ret != BZ_FINISH_OK /* BZ_FINISHing, but not done yet */
&& ret != BZ_STREAM_END /* BZ_FINISHed */
) {
bb_error_msg_and_die("internal error %d", ret);
}
n = IOBUF_SIZE - strm->avail_out;
if (n) {
n2 = full_write(STDOUT_FILENO, wbuf, n);
if (n2 != n) {
if (n2 >= 0)
errno = 0; /* prevent bogus error message */
bb_simple_perror_msg(n2 >= 0 ? "short write" : bb_msg_write_error);
return -1;
}
}
if (ret == BZ_STREAM_END)
break;
if (rlen && strm->avail_in == 0)
break;
}
return 0 IF_DESKTOP( + strm->total_out );
}
static
IF_DESKTOP(long long) int FAST_FUNC compressStream(transformer_state_t *xstate UNUSED_PARAM)
{
IF_DESKTOP(long long) int total;
unsigned opt, level;
ssize_t count;
bz_stream bzs; /* it's small */
#define strm (&bzs)
char *iobuf;
#define rbuf iobuf
#define wbuf (iobuf + IOBUF_SIZE)
iobuf = xmalloc(2 * IOBUF_SIZE);
opt = option_mask32 >> (BBUNPK_OPTSTRLEN IF_FEATURE_BZIP2_DECOMPRESS(+ 2) + 2);
/* skipped BBUNPK_OPTSTR, "dt" and "zs" bits */
opt |= 0x100; /* if nothing else, assume -9 */
level = 0;
for (;;) {
level++;
if (opt & 1) break;
opt >>= 1;
}
BZ2_bzCompressInit(strm, level);
while (1) {
count = full_read(STDIN_FILENO, rbuf, IOBUF_SIZE);
if (count < 0) {
bb_simple_perror_msg(bb_msg_read_error);
total = -1;
break;
}
/* if count == 0, bz_write finalizes compression */
total = bz_write(strm, rbuf, count, wbuf);
if (count == 0 || total < 0)
break;
}
/* Can't be conditional on ENABLE_FEATURE_CLEAN_UP -
* we are called repeatedly
*/
BZ2_bzCompressEnd(strm);
free(iobuf);
return total;
}
int bzip2_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int bzip2_main(int argc UNUSED_PARAM, char **argv)
{
unsigned opt;
/* standard bzip2 flags
* -d --decompress force decompression
* -z --compress force compression
* -k --keep keep (don't delete) input files
* -f --force overwrite existing output files
* -t --test test compressed file integrity
* -c --stdout output to standard out
* -q --quiet suppress noncritical error messages
* -v --verbose be verbose (a 2nd -v gives more)
* -s --small use less memory (at most 2500k)
* -1 .. -9 set block size to 100k .. 900k
* --fast alias for -1
* --best alias for -9
*/
opt = getopt32(argv, "^"
/* Must match BBUNPK_foo constants! */
BBUNPK_OPTSTR IF_FEATURE_BZIP2_DECOMPRESS("dt") "zs123456789"
"\0" "s2" /* -s means -2 (compatibility) */
);
#if ENABLE_FEATURE_BZIP2_DECOMPRESS /* bunzip2_main may not be visible... */
if (opt & (BBUNPK_OPT_DECOMPRESS|BBUNPK_OPT_TEST)) /* -d and/or -t */
return bunzip2_main(argc, argv);
#else
/* clear "decompress" and "test" bits (or bbunpack() can get confused) */
/* in !BZIP2_DECOMPRESS config, these bits are -zs and are unused */
option_mask32 = opt & ~(BBUNPK_OPT_DECOMPRESS|BBUNPK_OPT_TEST);
#endif
argv += optind;
return bbunpack(argv, compressStream, append_ext, "bz2");
}
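
In `compressStream()` the compression level is recovered from the option bitmask rather than parsed as a number: after shifting away the leading option bits, bit 0 corresponds to `-1` and bit 8 to `-9`, and the loop reports the position of the lowest set bit. The sketch below reproduces that loop in isolation. `bzip2_level_from_bits` is a hypothetical name, and its argument is assumed to be the already shifted mask.

```c
#include <assert.h>

/* Hypothetical helper mirroring the loop in compressStream().
 * level_bits: bit 0 set means -1 was given, bit 8 means -9. */
unsigned bzip2_level_from_bits(unsigned level_bits)
{
	unsigned level = 0;

	assert(level_bits < 0x200); /* only nine level flags exist */

	level_bits |= 0x100;        /* if nothing else, assume -9 */
	for (;;) {
		level++;
		if (level_bits & 1)
			break;              /* lowest set bit found */
		level_bits >>= 1;
	}
	return level;
}
```

Forcing bit 8 on guarantees that the loop terminates and makes `-9` the default. When several level flags are given, the lowest one wins, because the scan starts at bit 0.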


@ -0,0 +1,35 @@
/*
* Copyright (C) 2021 Denys Vlasenko <vda.linux@googlemail.com>
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
//kbuild:lib-$(CONFIG_FEATURE_TAR_CREATE) += chksum_and_xwrite_tar_header.o
//kbuild:lib-$(CONFIG_SMEMCAP) += chksum_and_xwrite_tar_header.o
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC chksum_and_xwrite_tar_header(int fd, struct tar_header_t *hp)
{
/* POSIX says that the checksum is done on unsigned bytes
* (Sun and HP-UX get it wrong... more details in
* GNU tar source) */
const unsigned char *cp;
unsigned int chksum, size;
strcpy(hp->magic, "ustar  "); /* "ustar", two spaces, NUL: fills magic and version */
/* Calculate and store the checksum (the sum of all of the bytes of
* the header). The checksum field must be filled with blanks for the
* calculation. The checksum field is formatted differently from the
* other fields: it has 6 digits, a NUL, then a space -- rather than
* digits, followed by a NUL like the other fields... */
memset(hp->chksum, ' ', sizeof(hp->chksum));
cp = (const unsigned char *) hp;
chksum = 0;
size = sizeof(*hp);
do { chksum += *cp++; } while (--size);
sprintf(hp->chksum, "%06o", chksum);
xwrite(fd, hp, sizeof(*hp));
}
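
The checksum rule described in the comments above can be tried on a raw 512-byte block. This is a sketch under stated assumptions, not the BusyBox implementation: it works on a plain byte array instead of `struct tar_header_t`, it takes the checksum field to be the 8 bytes at offset 148 as in the ustar header layout, and `tar_fill_chksum` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed ustar layout: 512-byte header, checksum field is the
 * 8 bytes starting at offset 148. */
enum { TAR_BLOCK = 512, CHKSUM_OFF = 148, CHKSUM_LEN = 8 };

/* Hypothetical helper: compute and store the header checksum. */
void tar_fill_chksum(unsigned char *hdr)
{
	unsigned sum = 0;
	char digits[CHKSUM_LEN];
	int i;

	assert(hdr != NULL);

	/* The field counts as eight blanks while the sum is taken. */
	memset(hdr + CHKSUM_OFF, ' ', CHKSUM_LEN);
	for (i = 0; i < TAR_BLOCK; i++)
		sum += hdr[i]; /* bytes are summed as unsigned values */

	/* Largest possible sum is 512 * 255 = 0377000 octal, so six
	 * octal digits always suffice. */
	snprintf(digits, sizeof(digits), "%06o", sum);

	/* Copy six digits plus the NUL; the eighth byte stays a space. */
	memcpy(hdr + CHKSUM_OFF, digits, 7);
}
```

For an otherwise all-zero header the eight blanks sum to 8 * 32 = 256, so the field ends up holding `000400`, a NUL, and a space.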


@ -0,0 +1,581 @@
/* vi: set sw=4 ts=4: */
/*
* Mini cpio implementation for busybox
*
* Copyright (C) 2001 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*
* Limitations:
* Doesn't check CRC's
* Only supports new ASCII and CRC formats
*/
//config:config CPIO
//config: bool "cpio (15 kb)"
//config: default y
//config: help
//config: cpio is an archival utility program used to create, modify, and
//config: extract contents from archives.
//config: cpio has 110 bytes of overhead for every stored file.
//config:
//config: This implementation of cpio can extract cpio archives created in the
//config: "newc" or "crc" format.
//config:
//config: Unless you have a specific application which requires cpio, you
//config: should probably say N here.
//config:
//config:config FEATURE_CPIO_O
//config: bool "Support archive creation"
//config: default y
//config: depends on CPIO
//config: help
//config: This implementation of cpio can create cpio archives in the "newc"
//config: format only.
//config:
//config:config FEATURE_CPIO_P
//config: bool "Support passthrough mode"
//config: default y
//config: depends on FEATURE_CPIO_O
//config: help
//config: Passthrough mode. Rarely used.
//config:
//config:config FEATURE_CPIO_IGNORE_DEVNO
//config: bool "Support --ignore-devno like GNU cpio"
//config: default y
//config: depends on FEATURE_CPIO_O && LONG_OPTS
//config: help
//config: Optionally ignore device numbers when creating archives.
//config:
//config:config FEATURE_CPIO_RENUMBER_INODES
//config: bool "Support --renumber-inodes like GNU cpio"
//config: default y
//config: depends on FEATURE_CPIO_O && LONG_OPTS
//config: help
//config: Optionally renumber inodes when creating archives.
//applet:IF_CPIO(APPLET(cpio, BB_DIR_BIN, BB_SUID_DROP))
//kbuild:lib-$(CONFIG_CPIO) += cpio.o
//usage:#define cpio_trivial_usage
//usage: "[-dmvu] [-F FILE] [-R USER[:GRP]]" IF_FEATURE_CPIO_O(" [-H newc]")
//usage: " [-ti"IF_FEATURE_CPIO_O("o")"]" IF_FEATURE_CPIO_P(" [-p DIR]")
//usage: " [EXTR_FILE]..."
//usage:#define cpio_full_usage "\n\n"
//usage: "Extract (-i) or list (-t) files from a cpio archive on stdin"
//usage: IF_FEATURE_CPIO_O(", or"
//usage: "\ntake file list from stdin and create an archive (-o)"
//usage: IF_FEATURE_CPIO_P(" or copy files (-p)")
//usage: )
//usage: "\n"
//usage: "\nMain operation mode:"
//usage: "\n -t List"
//usage: "\n -i Extract EXTR_FILEs (or all)"
//usage: IF_FEATURE_CPIO_O(
//usage: "\n -o Create (requires -H newc)"
//usage: )
//usage: IF_FEATURE_CPIO_P(
//usage: "\n -p DIR Copy files to DIR"
//usage: )
//usage: "\nOptions:"
//usage: IF_FEATURE_CPIO_O(
//usage: "\n -H newc Archive format"
//usage: )
//usage: "\n -d Make leading directories"
//usage: "\n -m Restore mtime"
//usage: "\n -v Verbose"
//usage: "\n -u Overwrite"
//usage: "\n -F FILE Input (-t,-i,-p) or output (-o) file"
//usage: "\n -R USER[:GRP] Set owner of created files"
//usage: "\n -L Dereference symlinks"
//usage: "\n -0 NUL terminated input"
//usage: IF_FEATURE_CPIO_IGNORE_DEVNO(
//usage: "\n --ignore-devno"
//usage: )
//usage: IF_FEATURE_CPIO_RENUMBER_INODES(
//usage: "\n --renumber-inodes"
//usage: )
/* GNU cpio 2.9 --help (abridged):
Modes:
-t, --list List the archive
-i, --extract Extract files from an archive
-o, --create Create the archive
-p, --pass-through Copy-pass mode
Options valid in any mode:
--block-size=SIZE I/O block size = SIZE * 512 bytes
-B I/O block size = 5120 bytes
-c Use the old portable (ASCII) archive format
-C, --io-size=NUMBER I/O block size in bytes
-f, --nonmatching Only copy files that do not match given pattern
-F, --file=FILE Use FILE instead of standard input or output
-H, --format=FORMAT Use given archive FORMAT
-M, --message=STRING Print STRING when the end of a volume of the
backup media is reached
-n, --numeric-uid-gid If -v, show numeric UID and GID
--quiet Do not print the number of blocks copied
--rsh-command=COMMAND Use remote COMMAND instead of rsh
-v, --verbose Verbosely list the files processed
-V, --dot Print a "." for each file processed
-W, --warning=FLAG Control warning display: 'none','truncate','all';
multiple options accumulate
Options valid only in --extract mode:
-b, --swap Swap both halfwords of words and bytes of
halfwords in the data (equivalent to -sS)
-r, --rename Interactively rename files
-s, --swap-bytes Swap the bytes of each halfword in the files
-S, --swap-halfwords Swap the halfwords of each word (4 bytes)
--to-stdout Extract files to standard output
-E, --pattern-file=FILE Read additional patterns specifying filenames to
extract or list from FILE
--only-verify-crc Verify CRC's, don't actually extract the files
Options valid only in --create mode:
-A, --append Append to an existing archive
-O FILE File to use instead of standard output
Options valid only in --pass-through mode:
-l, --link Link files instead of copying them, when possible
Options valid in --extract and --create modes:
--absolute-filenames Do not strip file system prefix components from
the file names
--no-absolute-filenames Create all files relative to the current dir
Options valid in --create and --pass-through modes:
-0, --null A list of filenames is terminated by a NUL
-a, --reset-access-time Reset the access times of files after reading them
-I FILE File to use instead of standard input
-L, --dereference Dereference symbolic links (copy the files
that they point to instead of copying the links)
-R, --owner=[USER][:.][GRP] Set owner of created files
Options valid in --extract and --pass-through modes:
-d, --make-directories Create leading directories where needed
-m, --preserve-modification-time Retain mtime when creating files
--no-preserve-owner Do not change the ownership of the files
--sparse Write files with blocks of zeros as sparse files
-u, --unconditional Replace all files unconditionally
*/
#include "libbb.h"
#include "common_bufsiz.h"
#include "bb_archive.h"
enum {
OPT_EXTRACT = (1 << 0),
OPT_TEST = (1 << 1),
OPT_NUL_TERMINATED = (1 << 2),
OPT_UNCONDITIONAL = (1 << 3),
OPT_VERBOSE = (1 << 4),
OPT_CREATE_LEADING_DIR = (1 << 5),
OPT_PRESERVE_MTIME = (1 << 6),
OPT_DEREF = (1 << 7),
OPT_FILE = (1 << 8),
OPT_OWNER = (1 << 9),
OPTBIT_OWNER = 9,
IF_FEATURE_CPIO_O(OPTBIT_CREATE ,)
IF_FEATURE_CPIO_O(OPTBIT_FORMAT ,)
IF_FEATURE_CPIO_P(OPTBIT_PASSTHROUGH,)
IF_LONG_OPTS( OPTBIT_QUIET ,)
IF_LONG_OPTS( OPTBIT_2STDOUT ,)
IF_FEATURE_CPIO_IGNORE_DEVNO(OPTBIT_IGNORE_DEVNO,)
IF_FEATURE_CPIO_RENUMBER_INODES(OPTBIT_RENUMBER_INODES,)
OPT_CREATE = IF_FEATURE_CPIO_O((1 << OPTBIT_CREATE )) + 0,
OPT_FORMAT = IF_FEATURE_CPIO_O((1 << OPTBIT_FORMAT )) + 0,
OPT_PASSTHROUGH = IF_FEATURE_CPIO_P((1 << OPTBIT_PASSTHROUGH)) + 0,
OPT_QUIET = IF_LONG_OPTS( (1 << OPTBIT_QUIET )) + 0,
OPT_2STDOUT = IF_LONG_OPTS( (1 << OPTBIT_2STDOUT )) + 0,
OPT_IGNORE_DEVNO = IF_FEATURE_CPIO_IGNORE_DEVNO((1 << OPTBIT_IGNORE_DEVNO)) + 0,
OPT_RENUMBER_INODES = IF_FEATURE_CPIO_RENUMBER_INODES((1 << OPTBIT_RENUMBER_INODES)) + 0,
};
#define OPTION_STR "it0uvdmLF:R:"
struct globals {
struct bb_uidgid_t owner_ugid;
ino_t next_inode;
} FIX_ALIASING;
#define G (*(struct globals*)bb_common_bufsiz1)
void BUG_cpio_globals_too_big(void);
#define INIT_G() do { \
setup_common_bufsiz(); \
G.owner_ugid.uid = -1L; \
G.owner_ugid.gid = -1L; \
} while (0)
#if ENABLE_FEATURE_CPIO_O
static off_t cpio_pad4(off_t size)
{
int i;
i = (- size) & 3;
size += i;
while (--i >= 0)
bb_putchar('\0');
return size;
}
/* Return value will become exit code.
* It's ok to exit instead of return. */
static NOINLINE int cpio_o(void)
{
struct name_s {
struct name_s *next;
char name[1];
};
struct inodes_s {
struct inodes_s *next;
struct name_s *names;
struct stat st;
#if ENABLE_FEATURE_CPIO_RENUMBER_INODES
ino_t mapped_inode;
#endif
};
struct inodes_s *links = NULL;
off_t bytes = 0; /* output bytes count */
while (1) {
const char *name;
char *line;
struct stat st;
line = (option_mask32 & OPT_NUL_TERMINATED)
? bb_get_chunk_from_file(stdin, NULL)
: xmalloc_fgetline(stdin);
if (line) {
/* Strip leading "./[./]..." from the filename */
name = line;
while (name[0] == '.' && name[1] == '/') {
while (*++name == '/')
continue;
}
if (!*name) { /* line is empty */
free(line);
continue;
}
if ((option_mask32 & OPT_DEREF)
? stat(name, &st)
: lstat(name, &st)
) {
abort_cpio_o:
bb_simple_perror_msg_and_die(name);
}
if (G.owner_ugid.uid != (uid_t)-1L)
st.st_uid = G.owner_ugid.uid;
if (G.owner_ugid.gid != (gid_t)-1L)
st.st_gid = G.owner_ugid.gid;
if (!(S_ISLNK(st.st_mode) || S_ISREG(st.st_mode)))
st.st_size = 0; /* paranoia */
/* Store hardlinks for later processing, don't output them */
if (!S_ISDIR(st.st_mode) && st.st_nlink > 1) {
struct name_s *n;
struct inodes_s *l;
/* Do we have this hardlink remembered? */
l = links;
while (1) {
if (l == NULL) {
/* Not found: add new item to "links" list */
l = xzalloc(sizeof(*l));
l->st = st;
l->next = links;
#if ENABLE_FEATURE_CPIO_RENUMBER_INODES
if (option_mask32 & OPT_RENUMBER_INODES)
l->mapped_inode = ++G.next_inode;
#endif
links = l;
break;
}
if (l->st.st_ino == st.st_ino) {
/* found */
break;
}
l = l->next;
}
/* Add new name to "l->names" list */
n = xmalloc(sizeof(*n) + strlen(name));
strcpy(n->name, name);
n->next = l->names;
l->names = n;
free(line);
continue;
}
#if ENABLE_FEATURE_CPIO_RENUMBER_INODES
else if (option_mask32 & OPT_RENUMBER_INODES) {
st.st_ino = ++G.next_inode;
}
#endif
} else { /* line == NULL: EOF */
next_link:
if (links) {
/* Output hardlink's data */
st = links->st;
name = links->names->name;
links->names = links->names->next;
#if ENABLE_FEATURE_CPIO_RENUMBER_INODES
if (links->mapped_inode)
st.st_ino = links->mapped_inode;
#endif
/* GNU cpio is reported to emit file data
* only for the last instance. Mimic that. */
if (links->names == NULL)
links = links->next;
else
st.st_size = 0;
/* NB: we leak links->names and/or links,
* this is intended (we exit soon anyway) */
} else {
/* If no (more) hardlinks to output,
* output "trailer" entry */
name = cpio_TRAILER;
/* st.st_size == 0 is a must, but for uniformity
* in the output, we zero out everything */
memset(&st, 0, sizeof(st));
/* st.st_nlink = 1; - GNU cpio does this */
}
}
#if ENABLE_FEATURE_CPIO_IGNORE_DEVNO
if (option_mask32 & OPT_IGNORE_DEVNO)
st.st_dev = st.st_rdev = 0;
#endif
bytes += printf("070701"
"%08X%08X%08X%08X%08X%08X%08X"
"%08X%08X%08X%08X" /* GNU cpio uses uppercase hex */
/* strlen+1: */ "%08X"
/* chksum: */ "00000000" /* (only for "070702" files) */
/* name,NUL: */ "%s%c",
(unsigned)(uint32_t) st.st_ino,
(unsigned)(uint32_t) st.st_mode,
(unsigned)(uint32_t) st.st_uid,
(unsigned)(uint32_t) st.st_gid,
(unsigned)(uint32_t) st.st_nlink,
(unsigned)(uint32_t) st.st_mtime,
(unsigned)(uint32_t) st.st_size,
(unsigned)(uint32_t) major(st.st_dev),
(unsigned)(uint32_t) minor(st.st_dev),
(unsigned)(uint32_t) major(st.st_rdev),
(unsigned)(uint32_t) minor(st.st_rdev),
(unsigned)(strlen(name) + 1),
name, '\0');
bytes = cpio_pad4(bytes);
if (st.st_size) {
if (S_ISLNK(st.st_mode)) {
char *lpath = xmalloc_readlink_or_warn(name);
if (!lpath)
goto abort_cpio_o;
bytes += printf("%s", lpath);
free(lpath);
} else { /* S_ISREG */
int fd = xopen(name, O_RDONLY);
fflush_all();
/* We must abort if file got shorter too! */
bb_copyfd_exact_size(fd, STDOUT_FILENO, st.st_size);
bytes += st.st_size;
close(fd);
}
bytes = cpio_pad4(bytes);
}
if (!line) {
if (name != cpio_TRAILER)
goto next_link;
/* TODO: GNU cpio pads trailer to 512 bytes, do we want that? */
return EXIT_SUCCESS;
}
free(line);
} /* end of "while (1)" */
}
#endif
int cpio_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int cpio_main(int argc UNUSED_PARAM, char **argv)
{
archive_handle_t *archive_handle;
char *cpio_filename;
char *cpio_owner;
IF_FEATURE_CPIO_O(const char *cpio_fmt = "";)
unsigned opt;
#if ENABLE_LONG_OPTS
const char *long_opts =
"extract\0" No_argument "i"
"list\0" No_argument "t"
#if ENABLE_FEATURE_CPIO_O
"create\0" No_argument "o"
"format\0" Required_argument "H"
#if ENABLE_FEATURE_CPIO_P
"pass-through\0" No_argument "p"
#endif
#endif
"owner\0" Required_argument "R"
"verbose\0" No_argument "v"
"null\0" No_argument "0"
"quiet\0" No_argument "\xff"
"to-stdout\0" No_argument "\xfe"
#if ENABLE_FEATURE_CPIO_IGNORE_DEVNO
"ignore-devno\0" No_argument "\xfd"
#endif
#if ENABLE_FEATURE_CPIO_RENUMBER_INODES
"renumber-inodes\0" No_argument "\xfc"
#endif
;
#endif
INIT_G();
archive_handle = init_handle();
/* archive_handle->src_fd = STDIN_FILENO; - done by init_handle */
archive_handle->ah_flags = ARCHIVE_EXTRACT_NEWER;
/* As of now we do not enforce this: */
/* -i,-t,-o,-p are mutually exclusive */
/* -u,-d,-m make sense only with -i or -p */
/* -L makes sense only with -o or -p */
#if !ENABLE_FEATURE_CPIO_O
opt = getopt32long(argv, OPTION_STR, long_opts, &cpio_filename, &cpio_owner);
#else
opt = getopt32long(argv, OPTION_STR "oH:" IF_FEATURE_CPIO_P("p"), long_opts,
&cpio_filename, &cpio_owner, &cpio_fmt);
#endif
argv += optind;
if (opt & OPT_OWNER) { /* -R */
parse_chown_usergroup_or_die(&G.owner_ugid, cpio_owner);
archive_handle->cpio__owner = G.owner_ugid;
}
#if !ENABLE_FEATURE_CPIO_O
if (opt & OPT_FILE) { /* -F */
xmove_fd(xopen(cpio_filename, O_RDONLY), STDIN_FILENO);
}
#else
if ((opt & (OPT_FILE|OPT_CREATE)) == OPT_FILE) { /* -F without -o */
xmove_fd(xopen(cpio_filename, O_RDONLY), STDIN_FILENO);
}
if (opt & OPT_PASSTHROUGH) {
pid_t pid;
struct fd_pair pp;
if (argv[0] == NULL)
bb_show_usage();
if (opt & OPT_CREATE_LEADING_DIR)
/* GNU cpio 2.13: "cpio -d -p a/b/c" works */
bb_make_directory(argv[0], -1, FILEUTILS_RECUR);
/* Crude existence check:
* close(xopen(argv[0], O_RDONLY | O_DIRECTORY));
* We can also xopen, fstat, IS_DIR, later fchdir.
* This would check for existence earlier and cleaner.
* As it stands now, if we fail xchdir later,
* child dies on EPIPE, unless it caught
* a different problem earlier.
* This is good enough for now.
*/
//FIXME: GNU cpio -d -p DIR does not immediately create DIR -
//it just prepends "DIR/" to the names of files to be created.
//The first file (fails to) be copied, and then the -d logic
//triggers and creates all necessary directories.
//IOW: bare "cpio -d -p DIR" + ^C shouldn't create anything.
#if !BB_MMU
pp.rd = 3;
pp.wr = 4;
if (!re_execed) {
close(3);
close(4);
xpiped_pair(pp);
}
#else
xpiped_pair(pp);
#endif
pid = fork_or_rexec(argv - optind);
if (pid == 0) { /* child */
close(pp.rd);
xmove_fd(pp.wr, STDOUT_FILENO);
goto dump;
}
/* parent */
xchdir(*argv++);
close(pp.wr);
xmove_fd(pp.rd, STDIN_FILENO);
//opt &= ~OPT_PASSTHROUGH;
opt |= OPT_EXTRACT;
goto skip;
}
/* -o */
if (opt & OPT_CREATE) {
if (cpio_fmt[0] != 'n') /* we _require_ "-H newc" */
bb_show_usage();
if (opt & OPT_FILE) {
xmove_fd(xopen(cpio_filename, O_WRONLY | O_CREAT | O_TRUNC), STDOUT_FILENO);
}
dump:
return cpio_o();
}
skip:
#endif
/* One of either extract or test options must be given */
if ((opt & (OPT_TEST | OPT_EXTRACT)) == 0) {
bb_show_usage();
}
if (opt & OPT_TEST) {
/* if both extract and test options are given, ignore extract option */
opt &= ~OPT_EXTRACT;
archive_handle->action_header = header_list;
}
if (opt & OPT_EXTRACT) {
archive_handle->action_data = data_extract_all;
if (opt & OPT_2STDOUT)
archive_handle->action_data = data_extract_to_stdout;
}
if (opt & OPT_UNCONDITIONAL) {
archive_handle->ah_flags |= ARCHIVE_UNLINK_OLD;
archive_handle->ah_flags &= ~ARCHIVE_EXTRACT_NEWER;
}
if (opt & OPT_VERBOSE) {
if (archive_handle->action_header == header_list) {
archive_handle->action_header = header_verbose_list;
} else {
archive_handle->action_header = header_list;
}
}
if (opt & OPT_CREATE_LEADING_DIR) {
archive_handle->ah_flags |= ARCHIVE_CREATE_LEADING_DIRS;
}
if (opt & OPT_PRESERVE_MTIME) {
archive_handle->ah_flags |= ARCHIVE_RESTORE_DATE;
}
while (*argv) {
archive_handle->filter = filter_accept_list;
llist_add_to(&archive_handle->accept, *argv);
argv++;
}
/* see get_header_cpio */
archive_handle->cpio__blocks = (off_t)-1;
while (get_header_cpio(archive_handle) == EXIT_SUCCESS)
continue;
create_links_from_list(archive_handle->link_placeholders);
if (archive_handle->cpio__blocks != (off_t)-1
&& !(opt & OPT_QUIET)
) {
fflush_all();
fprintf(stderr, "%"OFF_FMT"u blocks\n", archive_handle->cpio__blocks);
}
return EXIT_SUCCESS;
}
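
`cpio_o()` writes each "newc" entry as a fixed ASCII header, the NUL-terminated name, and the file data, padding to a 4-byte boundary after the name and again after the data. The sketch below works through that size arithmetic. `pad4_len` and `newc_entry_size` are hypothetical helpers that only count bytes, and the header length is derived from the format string above (6 magic characters plus 13 eight-digit hex fields).

```c
#include <assert.h>

/* "070701" magic plus 13 fields of 8 hex digits each: 110 bytes. */
enum { NEWC_HDR_LEN = 6 + 13 * 8 };

/* Hypothetical helper: number of NUL bytes needed to reach the next
 * multiple of 4. Same trick as cpio_pad4(): (-size) & 3. */
unsigned pad4_len(unsigned long size)
{
	return (unsigned)((0UL - size) & 3);
}

/* Hypothetical helper: total bytes one entry occupies in the archive. */
unsigned long newc_entry_size(unsigned long name_len, unsigned long data_len)
{
	unsigned long n;

	assert(name_len > 0);

	n = NEWC_HDR_LEN + name_len + 1; /* header, name, NUL */
	n += pad4_len(n);                /* align start of data */
	n += data_len;
	n += pad4_len(n);                /* align start of next header */
	return n;
}
```

With the 10-character trailer name `TRAILER!!!` and no data, the entry takes 110 + 11 = 121 bytes, padded to 124.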

File diff suppressed because it is too large


@ -0,0 +1,134 @@
/* vi: set sw=4 ts=4: */
/*
* dpkg-deb packs, unpacks and provides information about Debian archives.
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
//config:config DPKG_DEB
//config: bool "dpkg-deb (29 kb)"
//config: default y
//config: select FEATURE_SEAMLESS_GZ
//config: help
//config: dpkg-deb unpacks and provides information about Debian archives.
//config:
//config: This implementation of dpkg-deb cannot pack archives.
//config:
//config: Unless you have a specific application which requires dpkg-deb,
//config: say N here.
//applet:IF_DPKG_DEB(APPLET_ODDNAME(dpkg-deb, dpkg_deb, BB_DIR_USR_BIN, BB_SUID_DROP, dpkg_deb))
//kbuild:lib-$(CONFIG_DPKG_DEB) += dpkg_deb.o
//usage:#define dpkg_deb_trivial_usage
//usage: "[-cefxX] FILE [DIR]"
//usage:#define dpkg_deb_full_usage "\n\n"
//usage: "Perform actions on Debian packages (.deb)\n"
//usage: "\n -c List files"
//usage: "\n -f Print control fields"
//usage: "\n -e Extract control files to DIR (default: ./DEBIAN)"
//usage: "\n -x Extract files to DIR (no default)"
//usage: "\n -X Verbose extract"
//usage:
//usage:#define dpkg_deb_example_usage
//usage: "$ dpkg-deb -X ./busybox_0.48-1_i386.deb /tmp\n"
#include "libbb.h"
#include "bb_archive.h"
#define DPKG_DEB_OPT_CONTENTS 1
#define DPKG_DEB_OPT_CONTROL 2
#define DPKG_DEB_OPT_FIELD 4
#define DPKG_DEB_OPT_EXTRACT_VERBOSE 8
#define DPKG_DEB_OPT_EXTRACT 16
int dpkg_deb_main(int argc, char **argv) MAIN_EXTERNALLY_VISIBLE;
int dpkg_deb_main(int argc UNUSED_PARAM, char **argv)
{
archive_handle_t *ar_archive;
archive_handle_t *tar_archive;
llist_t *control_tar_llist = NULL;
unsigned opt;
const char *extract_dir;
/* Setup the tar archive handle */
tar_archive = init_handle();
/* Setup an ar archive handle that refers to the gzip sub archive */
ar_archive = init_handle();
ar_archive->dpkg__sub_archive = tar_archive;
ar_archive->filter = filter_accept_list_reassign;
llist_add_to(&ar_archive->accept, (char*)"data.tar");
llist_add_to(&control_tar_llist, (char*)"control.tar");
#if ENABLE_FEATURE_SEAMLESS_GZ
llist_add_to(&ar_archive->accept, (char*)"data.tar.gz");
llist_add_to(&control_tar_llist, (char*)"control.tar.gz");
#endif
#if ENABLE_FEATURE_SEAMLESS_BZ2
llist_add_to(&ar_archive->accept, (char*)"data.tar.bz2");
llist_add_to(&control_tar_llist, (char*)"control.tar.bz2");
#endif
#if ENABLE_FEATURE_SEAMLESS_LZMA
llist_add_to(&ar_archive->accept, (char*)"data.tar.lzma");
llist_add_to(&control_tar_llist, (char*)"control.tar.lzma");
#endif
#if ENABLE_FEATURE_SEAMLESS_XZ
llist_add_to(&ar_archive->accept, (char*)"data.tar.xz");
llist_add_to(&control_tar_llist, (char*)"control.tar.xz");
#endif
/* Must have 1 or 2 args */
opt = getopt32(argv, "^" "cefXx"
"\0" "-1:?2:c--efXx:e--cfXx:f--ceXx:X--cefx:x--cefX"
);
argv += optind;
//argc -= optind;
extract_dir = argv[1];
if (opt & DPKG_DEB_OPT_CONTENTS) { // -c
tar_archive->action_header = header_verbose_list;
if (extract_dir)
bb_show_usage();
}
if (opt & DPKG_DEB_OPT_FIELD) { // -f
/* Print the entire control file */
//TODO: standard tool accepts an optional list of fields to print
ar_archive->accept = control_tar_llist;
llist_add_to(&(tar_archive->accept), (char*)"./control");
tar_archive->filter = filter_accept_list;
tar_archive->action_data = data_extract_to_stdout;
if (extract_dir)
bb_show_usage();
}
if (opt & DPKG_DEB_OPT_CONTROL) { // -e
ar_archive->accept = control_tar_llist;
tar_archive->action_data = data_extract_all;
if (!extract_dir)
extract_dir = "./DEBIAN";
}
if (opt & (DPKG_DEB_OPT_EXTRACT_VERBOSE | DPKG_DEB_OPT_EXTRACT)) { // -Xx
if (opt & DPKG_DEB_OPT_EXTRACT_VERBOSE)
tar_archive->action_header = header_list;
tar_archive->action_data = data_extract_all;
if (!extract_dir)
bb_show_usage();
}
/* Standard tool supports "-" */
tar_archive->src_fd = ar_archive->src_fd = xopen_stdin(argv[0]);
if (extract_dir) {
mkdir(extract_dir, 0777); /* bb_make_directory(extract_dir, 0777, 0) */
xchdir(extract_dir);
}
/* Do it */
unpack_ar_archive(ar_archive);
/* Cleanup */
if (ENABLE_FEATURE_CLEAN_UP)
close(ar_archive->src_fd);
return EXIT_SUCCESS;
}
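
The DPKG_DEB_OPT_* values in dpkg_deb_main() follow the order of the letters in the option string "cefXx": the letter at position i sets bit 1 << i in the returned mask. The sketch below illustrates only that mapping. It is not BusyBox's getopt32(); the helper name flags_to_mask and the OPT_* enum are hypothetical, and the mutual-exclusion rules encoded in the "c--efXx:..." spec are not reproduced.

```c
#include <string.h>

/* Bit values mirror the DPKG_DEB_OPT_* macros: bit i <-> i-th letter of "cefXx". */
enum {
    OPT_CONTENTS        = 1 << 0, /* -c */
    OPT_CONTROL         = 1 << 1, /* -e */
    OPT_FIELD           = 1 << 2, /* -f */
    OPT_EXTRACT_VERBOSE = 1 << 3, /* -X */
    OPT_EXTRACT         = 1 << 4  /* -x */
};

/*
 * Hypothetical helper, not part of BusyBox: convert one "-abc" style
 * argument into a bitmask using the letter positions in optstr.
 * Returns 0 if arg is not an option or contains an unknown letter.
 */
static unsigned flags_to_mask(const char *optstr, const char *arg)
{
    unsigned mask = 0;

    if (arg[0] != '-' || arg[1] == '\0')
        return 0;
    for (arg++; *arg != '\0'; arg++) {
        const char *p = strchr(optstr, *arg);
        if (p == NULL)
            return 0;
        mask |= 1u << (p - optstr);
    }
    return mask;
}
```

With this helper, flags_to_mask("cefXx", "-X") yields 8, the value of DPKG_DEB_OPT_EXTRACT_VERBOSE, which is why the applet can test options with plain bitwise AND.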

File diff suppressed because it is too large.

@ -0,0 +1,98 @@
# Makefile for busybox
#
# Copyright (C) 1999-2004 by Erik Andersen <andersen@codepoet.org>
#
# Licensed under GPLv2 or later, see file LICENSE in this source tree.
lib-y:= common.o
COMMON_FILES:= \
\
data_skip.o \
data_extract_all.o \
data_extract_to_stdout.o \
\
unsafe_symlink_target.o \
\
filter_accept_all.o \
filter_accept_list.o \
filter_accept_reject_list.o \
\
header_skip.o \
header_list.o \
header_verbose_list.o \
\
seek_by_read.o \
seek_by_jump.o \
\
data_align.o \
find_list_entry.o \
init_handle.o
DPKG_FILES:= \
unpack_ar_archive.o \
filter_accept_list_reassign.o \
unsafe_prefix.o \
get_header_ar.o \
get_header_tar.o \
get_header_tar_gz.o \
get_header_tar_bz2.o \
get_header_tar_lzma.o \
get_header_tar_xz.o \
INSERT
lib-$(CONFIG_DPKG) += $(DPKG_FILES)
lib-$(CONFIG_DPKG_DEB) += $(DPKG_FILES)
lib-$(CONFIG_AR) += get_header_ar.o unpack_ar_archive.o
lib-$(CONFIG_CPIO) += get_header_cpio.o
lib-$(CONFIG_TAR) += get_header_tar.o unsafe_prefix.o
lib-$(CONFIG_FEATURE_TAR_TO_COMMAND) += data_extract_to_command.o
lib-$(CONFIG_LZOP) += lzo1x_1.o lzo1x_1o.o lzo1x_d.o
lib-$(CONFIG_UNLZOP) += lzo1x_1.o lzo1x_1o.o lzo1x_d.o
lib-$(CONFIG_LZOPCAT) += lzo1x_1.o lzo1x_1o.o lzo1x_d.o
lib-$(CONFIG_LZOP_COMPR_HIGH) += lzo1x_9x.o
# 'bzip2 -d', bunzip2 or bzcat selects FEATURE_BZIP2_DECOMPRESS
lib-$(CONFIG_FEATURE_BZIP2_DECOMPRESS) += open_transformer.o decompress_bunzip2.o
lib-$(CONFIG_FEATURE_UNZIP_BZIP2) += open_transformer.o decompress_bunzip2.o
lib-$(CONFIG_UNLZMA) += open_transformer.o decompress_unlzma.o
lib-$(CONFIG_LZCAT) += open_transformer.o decompress_unlzma.o
lib-$(CONFIG_LZMA) += open_transformer.o decompress_unlzma.o
lib-$(CONFIG_FEATURE_UNZIP_LZMA) += open_transformer.o decompress_unlzma.o
lib-$(CONFIG_UNXZ) += open_transformer.o decompress_unxz.o
lib-$(CONFIG_XZCAT) += open_transformer.o decompress_unxz.o
lib-$(CONFIG_XZ) += open_transformer.o decompress_unxz.o
lib-$(CONFIG_FEATURE_UNZIP_XZ) += open_transformer.o decompress_unxz.o
# 'gzip -d', gunzip or zcat selects FEATURE_GZIP_DECOMPRESS
lib-$(CONFIG_FEATURE_GZIP_DECOMPRESS) += open_transformer.o decompress_gunzip.o
lib-$(CONFIG_UNCOMPRESS) += open_transformer.o decompress_uncompress.o
lib-$(CONFIG_UNZIP) += open_transformer.o decompress_gunzip.o unsafe_prefix.o
lib-$(CONFIG_RPM2CPIO) += open_transformer.o decompress_gunzip.o get_header_cpio.o
lib-$(CONFIG_RPM) += open_transformer.o decompress_gunzip.o get_header_cpio.o
lib-$(CONFIG_GZIP) += open_transformer.o
lib-$(CONFIG_BZIP2) += open_transformer.o
lib-$(CONFIG_LZOP) += open_transformer.o
lib-$(CONFIG_MAN) += open_transformer.o
lib-$(CONFIG_SETFONT) += open_transformer.o
lib-$(CONFIG_FEATURE_2_4_MODULES) += open_transformer.o
lib-$(CONFIG_MODINFO) += open_transformer.o
lib-$(CONFIG_INSMOD) += open_transformer.o
lib-$(CONFIG_DEPMOD) += open_transformer.o
lib-$(CONFIG_RMMOD) += open_transformer.o
lib-$(CONFIG_LSMOD) += open_transformer.o
lib-$(CONFIG_MODPROBE) += open_transformer.o
lib-$(CONFIG_MODPROBE_SMALL) += open_transformer.o
lib-$(CONFIG_FEATURE_SEAMLESS_Z) += open_transformer.o decompress_uncompress.o
lib-$(CONFIG_FEATURE_SEAMLESS_GZ) += open_transformer.o decompress_gunzip.o
lib-$(CONFIG_FEATURE_SEAMLESS_BZ2) += open_transformer.o decompress_bunzip2.o
lib-$(CONFIG_FEATURE_SEAMLESS_LZMA) += open_transformer.o decompress_unlzma.o
lib-$(CONFIG_FEATURE_SEAMLESS_XZ) += open_transformer.o decompress_unxz.o
lib-$(CONFIG_FEATURE_COMPRESS_USAGE) += open_transformer.o decompress_bunzip2.o
lib-$(CONFIG_FEATURE_COMPRESS_BBCONFIG) += open_transformer.o decompress_bunzip2.o
lib-$(CONFIG_FEATURE_SH_EMBEDDED_SCRIPTS) += open_transformer.o decompress_bunzip2.o
ifneq ($(lib-y),)
lib-y += $(COMMON_FILES)
endif

@ -0,0 +1,44 @@
bzip2 applet in busybox is based on lightly-modified source
of bzip2 version 1.0.4. bzip2 source is distributed
under the following conditions (copied verbatim from LICENSE file)
===========================================================
This program, "bzip2", the associated library "libbzip2", and all
documentation, are copyright (C) 1996-2006 Julian R Seward. All
rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
3. Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
4. The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Julian Seward, Cambridge, UK.
jseward@bzip.org
bzip2/libbzip2 version 1.0.4 of 20 December 2006

@ -0,0 +1,90 @@
This file is an abridged version of README from bzip2 1.0.4
Build instructions (which are not relevant to busyboxed bzip2)
are removed.
===========================================================
This is the README for bzip2/libbzip2.
This version is fully compatible with the previous public releases.
------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in this file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------
Please read and be aware of the following:
WARNING:
This program and library (attempts to) compress data by
performing several non-trivial transformations on it.
Unless you are 100% familiar with *all* the algorithms
contained herein, and with the consequences of modifying them,
you should NOT meddle with the compression or decompression
machinery. Incorrect changes can and very likely *will*
lead to disastrous loss of data.
DISCLAIMER:
I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
USE OF THIS PROGRAM/LIBRARY, HOWSOEVER CAUSED.
Every compression of a file implies an assumption that the
compressed file can be decompressed to reproduce the original.
Great efforts in design, coding and testing have been made to
ensure that this program works correctly. However, the complexity
of the algorithms, and, in particular, the presence of various
special cases in the code which occur with very low but non-zero
probability make it impossible to rule out the possibility of bugs
remaining in the program. DO NOT COMPRESS ANY DATA WITH THIS
PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER
SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.
That is not to say this program is inherently unreliable.
Indeed, I very much hope the opposite is true. bzip2/libbzip2
has been carefully constructed and extensively tested.
PATENTS:
To the best of my knowledge, bzip2/libbzip2 does not use any
patented algorithms. However, I do not have the resources
to carry out a patent search. Therefore I cannot give any
guarantee of the above statement.
I hope you find bzip2 useful. Feel free to contact me at
jseward@bzip.org
if you have any suggestions or queries. Many people mailed me with
comments, suggestions and patches after the releases of bzip-0.15,
bzip-0.21, and bzip2 versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0, 1.0.1,
1.0.2 and 1.0.3, and the changes in bzip2 are largely a result of this
feedback. I thank you for your comments.
bzip2's "home" is http://www.bzip.org/
Julian Seward
jseward@bzip.org
Cambridge, UK.
18 July 1996 (version 0.15)
25 August 1996 (version 0.21)
7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)
8 June 1999 (bzip2, version 0.9.5)
4 Sept 1999 (bzip2, version 0.9.5d)
5 May 2000 (bzip2, version 1.0pre8)
30 December 2001 (bzip2, version 1.0.2pre1)
15 February 2005 (bzip2, version 1.0.3)
20 December 2006 (bzip2, version 1.0.4)

File diff suppressed because it is too large.

@ -0,0 +1,428 @@
/*
* bzip2 is written by Julian Seward <jseward@bzip.org>.
* Adapted for busybox by Denys Vlasenko <vda.linux@googlemail.com>.
* See README and LICENSE files in this directory for more information.
*/
/*-------------------------------------------------------------*/
/*--- Library top-level functions. ---*/
/*--- bzlib.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* CHANGES
* 0.9.0 -- original version.
* 0.9.0a/b -- no changes in this file.
* 0.9.0c -- made zero-length BZ_FLUSH work correctly in bzCompress().
* fixed bzWrite/bzRead to ignore zero-length requests.
* fixed bzread to correctly handle read requests after EOF.
* wrong parameter order in call to bzDecompressInit in
* bzBuffToBuffDecompress. Fixed.
*/
/* #include "bzlib_private.h" */
/*---------------------------------------------------*/
/*--- Compression stuff ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
#if BZ_LIGHT_DEBUG
static
void bz_assert_fail(int errcode)
{
/* if (errcode == 1007) bb_error_msg_and_die("probably bad RAM"); */
bb_error_msg_and_die("internal error %d", errcode);
}
#endif
/*---------------------------------------------------*/
static
void prepare_new_block(EState* s)
{
int i;
s->nblock = 0;
//indexes into s->zbits[], initialization moved to init of s->zbits
//s->posZ = s->zbits; // was: s->numZ = 0;
//s->state_out_pos = s->zbits;
BZ_INITIALISE_CRC(s->blockCRC);
/* inlined memset would be nice to have here */
for (i = 0; i < 256; i++)
s->inUse[i] = 0;
s->blockNo++;
}
/*---------------------------------------------------*/
static
ALWAYS_INLINE
void init_RL(EState* s)
{
s->state_in_ch = 256;
s->state_in_len = 0;
}
static
int isempty_RL(EState* s)
{
return (s->state_in_ch >= 256 || s->state_in_len <= 0);
}
/*---------------------------------------------------*/
static
void BZ2_bzCompressInit(bz_stream *strm, int blockSize100k)
{
unsigned n;
EState* s;
s = xzalloc(sizeof(EState));
s->strm = strm;
n = 100000 * blockSize100k;
s->arr1 = xmalloc(n * sizeof(uint32_t));
s->mtfv = (uint16_t*)s->arr1;
s->ptr = (uint32_t*)s->arr1;
s->arr2 = xmalloc((n + BZ_N_OVERSHOOT) * sizeof(uint32_t));
s->block = (uint8_t*)s->arr2;
crc32_filltable(s->crc32table, 1);
s->state = BZ_S_INPUT;
s->mode = BZ_M_RUNNING;
s->blockSize100k = blockSize100k;
s->nblockMAX = n - 19;
strm->state = s;
/*strm->total_in = 0;*/
strm->total_out = 0;
init_RL(s);
prepare_new_block(s);
}
/*---------------------------------------------------*/
static
void add_pair_to_block(EState* s)
{
int32_t i;
uint8_t ch = (uint8_t)(s->state_in_ch);
for (i = 0; i < s->state_in_len; i++) {
BZ_UPDATE_CRC(s, s->blockCRC, ch);
}
s->inUse[s->state_in_ch] = 1;
switch (s->state_in_len) {
case 3:
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
/* fall through */
case 2:
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
/* fall through */
case 1:
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
break;
default:
s->inUse[s->state_in_len - 4] = 1;
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
s->block[s->nblock] = (uint8_t)ch; s->nblock++;
s->block[s->nblock] = (uint8_t)(s->state_in_len - 4);
s->nblock++;
break;
}
}
/*---------------------------------------------------*/
static
void flush_RL(EState* s)
{
if (s->state_in_ch < 256) add_pair_to_block(s);
init_RL(s);
}
/*---------------------------------------------------*/
#define ADD_CHAR_TO_BLOCK(zs, zchh0) \
{ \
uint32_t zchh = (uint32_t)(zchh0); \
/*-- fast track the common case --*/ \
if (zchh != zs->state_in_ch && zs->state_in_len == 1) { \
uint8_t ch = (uint8_t)(zs->state_in_ch); \
BZ_UPDATE_CRC(zs, zs->blockCRC, ch); \
zs->inUse[zs->state_in_ch] = 1; \
zs->block[zs->nblock] = (uint8_t)ch; \
zs->nblock++; \
zs->state_in_ch = zchh; \
} \
else \
/*-- general, uncommon cases --*/ \
if (zchh != zs->state_in_ch || zs->state_in_len == 255) { \
if (zs->state_in_ch < 256) \
add_pair_to_block(zs); \
zs->state_in_ch = zchh; \
zs->state_in_len = 1; \
} else { \
zs->state_in_len++; \
} \
}
/*---------------------------------------------------*/
static
void /*Bool*/ copy_input_until_stop(EState* s)
{
/*Bool progress_in = False;*/
#ifdef SAME_CODE_AS_BELOW
if (s->mode == BZ_M_RUNNING) {
/*-- fast track the common case --*/
while (1) {
/*-- no input? --*/
if (s->strm->avail_in == 0) break;
/*-- block full? --*/
if (s->nblock >= s->nblockMAX) break;
/*progress_in = True;*/
ADD_CHAR_TO_BLOCK(s, (uint32_t)(*(uint8_t*)(s->strm->next_in)));
s->strm->next_in++;
s->strm->avail_in--;
/*s->strm->total_in++;*/
}
} else
#endif
{
/*-- general, uncommon case --*/
while (1) {
/*-- no input? --*/
if (s->strm->avail_in == 0) break;
/*-- block full? --*/
if (s->nblock >= s->nblockMAX) break;
//# /*-- flush/finish end? --*/
//# if (s->avail_in_expect == 0) break;
/*progress_in = True;*/
ADD_CHAR_TO_BLOCK(s, *(uint8_t*)(s->strm->next_in));
s->strm->next_in++;
s->strm->avail_in--;
/*s->strm->total_in++;*/
//# s->avail_in_expect--;
}
}
/*return progress_in;*/
}
/*---------------------------------------------------*/
static
void /*Bool*/ copy_output_until_stop(EState* s)
{
/*Bool progress_out = False;*/
while (1) {
/*-- no output space? --*/
if (s->strm->avail_out == 0) break;
/*-- block done? --*/
if (s->state_out_pos >= s->posZ) break;
/*progress_out = True;*/
*(s->strm->next_out) = *s->state_out_pos++;
s->strm->avail_out--;
s->strm->next_out++;
s->strm->total_out++;
}
/*return progress_out;*/
}
/*---------------------------------------------------*/
static
void /*Bool*/ handle_compress(bz_stream *strm)
{
/*Bool progress_in = False;*/
/*Bool progress_out = False;*/
EState* s = strm->state;
while (1) {
if (s->state == BZ_S_OUTPUT) {
/*progress_out |=*/ copy_output_until_stop(s);
if (s->state_out_pos < s->posZ) break;
if (s->mode == BZ_M_FINISHING
//# && s->avail_in_expect == 0
&& s->strm->avail_in == 0
&& isempty_RL(s))
break;
prepare_new_block(s);
s->state = BZ_S_INPUT;
#ifdef FLUSH_IS_UNUSED
if (s->mode == BZ_M_FLUSHING
&& s->avail_in_expect == 0
&& isempty_RL(s))
break;
#endif
}
if (s->state == BZ_S_INPUT) {
/*progress_in |=*/ copy_input_until_stop(s);
//#if (s->mode != BZ_M_RUNNING && s->avail_in_expect == 0) {
if (s->mode != BZ_M_RUNNING && s->strm->avail_in == 0) {
flush_RL(s);
BZ2_compressBlock(s, (s->mode == BZ_M_FINISHING));
s->state = BZ_S_OUTPUT;
} else
if (s->nblock >= s->nblockMAX) {
BZ2_compressBlock(s, 0);
s->state = BZ_S_OUTPUT;
} else
if (s->strm->avail_in == 0) {
break;
}
}
}
/*return progress_in || progress_out;*/
}
/*---------------------------------------------------*/
static
int BZ2_bzCompress(bz_stream *strm, int action)
{
/*Bool progress;*/
EState* s;
s = strm->state;
switch (s->mode) {
case BZ_M_RUNNING:
if (action == BZ_RUN) {
/*progress =*/ handle_compress(strm);
/*return progress ? BZ_RUN_OK : BZ_PARAM_ERROR;*/
return BZ_RUN_OK;
}
#ifdef FLUSH_IS_UNUSED
else
if (action == BZ_FLUSH) {
//#s->avail_in_expect = strm->avail_in;
s->mode = BZ_M_FLUSHING;
goto case_BZ_M_FLUSHING;
}
#endif
else
/*if (action == BZ_FINISH)*/ {
//#s->avail_in_expect = strm->avail_in;
s->mode = BZ_M_FINISHING;
goto case_BZ_M_FINISHING;
}
#ifdef FLUSH_IS_UNUSED
case_BZ_M_FLUSHING:
case BZ_M_FLUSHING:
/*if (s->avail_in_expect != s->strm->avail_in)
return BZ_SEQUENCE_ERROR;*/
/*progress =*/ handle_compress(strm);
if (s->avail_in_expect > 0 || !isempty_RL(s) || s->state_out_pos < s->posZ)
return BZ_FLUSH_OK;
s->mode = BZ_M_RUNNING;
return BZ_RUN_OK;
#endif
case_BZ_M_FINISHING:
/*case BZ_M_FINISHING:*/
default:
/*if (s->avail_in_expect != s->strm->avail_in)
return BZ_SEQUENCE_ERROR;*/
/*progress =*/ handle_compress(strm);
/*if (!progress) return BZ_SEQUENCE_ERROR;*/
//#if (s->avail_in_expect > 0 || !isempty_RL(s) || s->state_out_pos < s->posZ)
//# return BZ_FINISH_OK;
if (s->strm->avail_in > 0 || !isempty_RL(s) || s->state_out_pos < s->posZ)
return BZ_FINISH_OK;
/*s->mode = BZ_M_IDLE;*/
return BZ_STREAM_END;
}
/* return BZ_OK; --not reached--*/
}
/*---------------------------------------------------*/
static
void BZ2_bzCompressEnd(bz_stream *strm)
{
EState* s;
s = strm->state;
free(s->arr1);
free(s->arr2);
//free(s->ftab); // made it array member of s
//free(s->crc32table); // ditto
free(s);
}
/*---------------------------------------------------*/
/*--- Misc convenience stuff ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
#ifdef EXAMPLE_CODE_FOR_MEM_TO_MEM_COMPRESSION
static
int BZ2_bzBuffToBuffCompress(char* dest,
unsigned int* destLen,
char* source,
unsigned int sourceLen,
int blockSize100k)
{
bz_stream strm;
int ret;
if (dest == NULL || destLen == NULL
|| source == NULL
|| blockSize100k < 1 || blockSize100k > 9
) {
return BZ_PARAM_ERROR;
}
BZ2_bzCompressInit(&strm, blockSize100k);
strm.next_in = source;
strm.next_out = dest;
strm.avail_in = sourceLen;
strm.avail_out = *destLen;
ret = BZ2_bzCompress(&strm, BZ_FINISH);
if (ret == BZ_FINISH_OK) goto output_overflow;
if (ret != BZ_STREAM_END) goto errhandler;
/* normal termination */
*destLen -= strm.avail_out;
BZ2_bzCompressEnd(&strm);
return BZ_OK;
output_overflow:
BZ2_bzCompressEnd(&strm);
return BZ_OUTBUFF_FULL;
errhandler:
BZ2_bzCompressEnd(&strm);
return ret;
}
#endif
/*-------------------------------------------------------------*/
/*--- end bzlib.c ---*/
/*-------------------------------------------------------------*/
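
The initial run-length stage implemented by ADD_CHAR_TO_BLOCK and add_pair_to_block() can be summarised as: runs of 1 to 3 equal bytes are stored literally, while runs of 4 to 255 are stored as four copies of the byte followed by a count byte holding (run length - 4). The following is a minimal standalone sketch of that scheme, not the BusyBox code: the function name rle1_encode is hypothetical, it works on a whole buffer instead of a stream, and it omits the CRC and inUse[] bookkeeping the real code performs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: run-length encode src[0..n-1] into dst.
 * dst is assumed to have room for n + n/4 + 1 bytes (the worst case is
 * a sequence of runs of exactly 4, which grow to 5 bytes each).
 * Returns the number of bytes written.
 */
static size_t rle1_encode(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t out = 0;
    size_t i = 0;

    while (i < n) {
        uint8_t ch = src[i];
        size_t run = 1;
        size_t k;

        /* Measure the run, capped at 255 as in ADD_CHAR_TO_BLOCK */
        while (i + run < n && src[i + run] == ch && run < 255)
            run++;

        /* Emit min(run, 4) literal copies of the byte */
        for (k = 0; k < run && k < 4; k++)
            dst[out++] = ch;
        /* Runs of 4..255 are followed by a count byte (run - 4) */
        if (run >= 4)
            dst[out++] = (uint8_t)(run - 4);

        i += run;
    }
    return out;
}
```

For example, seven 'a' bytes followed by one 'b' encode to six bytes: 'a' 'a' 'a' 'a' 0x03 'b'. A run of exactly four costs one extra byte, which is the price paid for not needing an escape marker.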

@ -0,0 +1,65 @@
/*
* bzip2 is written by Julian Seward <jseward@bzip.org>.
* Adapted for busybox by Denys Vlasenko <vda.linux@googlemail.com>.
* See README and LICENSE files in this directory for more information.
*/
/*-------------------------------------------------------------*/
/*--- Public header file for the library. ---*/
/*--- bzlib.h ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#define BZ_RUN 0
#define BZ_FLUSH 1
#define BZ_FINISH 2
#define BZ_OK 0
#define BZ_RUN_OK 1
#define BZ_FLUSH_OK 2
#define BZ_FINISH_OK 3
#define BZ_STREAM_END 4
#define BZ_SEQUENCE_ERROR (-1)
#define BZ_PARAM_ERROR (-2)
#define BZ_MEM_ERROR (-3)
#define BZ_DATA_ERROR (-4)
#define BZ_DATA_ERROR_MAGIC (-5)
#define BZ_IO_ERROR (-6)
#define BZ_UNEXPECTED_EOF (-7)
#define BZ_OUTBUFF_FULL (-8)
#define BZ_CONFIG_ERROR (-9)
typedef struct bz_stream {
void *state;
char *next_in;
char *next_out;
unsigned avail_in;
unsigned avail_out;
/*unsigned long long total_in;*/
unsigned long long total_out;
} bz_stream;
/*-- Core (low-level) library functions --*/
static void BZ2_bzCompressInit(bz_stream *strm, int blockSize100k);
static int BZ2_bzCompress(bz_stream *strm, int action);
#if ENABLE_FEATURE_CLEAN_UP
static void BZ2_bzCompressEnd(bz_stream *strm);
#endif
/*-------------------------------------------------------------*/
/*--- end bzlib.h ---*/
/*-------------------------------------------------------------*/

@ -0,0 +1,226 @@
/*
* bzip2 is written by Julian Seward <jseward@bzip.org>.
* Adapted for busybox by Denys Vlasenko <vda.linux@googlemail.com>.
* See README and LICENSE files in this directory for more information.
*/
/*-------------------------------------------------------------*/
/*--- Private header file for the library. ---*/
/*--- bzlib_private.h ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* #include "bzlib.h" */
/*-- General stuff. --*/
typedef unsigned char Bool;
#define True ((Bool)1)
#define False ((Bool)0)
#if BZ_LIGHT_DEBUG
static void bz_assert_fail(int errcode) NORETURN;
#define AssertH(cond, errcode) \
do { \
if (!(cond)) \
bz_assert_fail(errcode); \
} while (0)
#else
#define AssertH(cond, msg) do { } while (0)
#endif
#if BZ_DEBUG
#define AssertD(cond, msg) \
do { \
if (!(cond)) \
bb_error_msg_and_die("(debug build): internal error %s", msg); \
} while (0)
#else
#define AssertD(cond, msg) do { } while (0)
#endif
/*-- Header bytes. --*/
#define BZ_HDR_B 0x42 /* 'B' */
#define BZ_HDR_Z 0x5a /* 'Z' */
#define BZ_HDR_h 0x68 /* 'h' */
#define BZ_HDR_0 0x30 /* '0' */
#define BZ_HDR_BZh0 0x425a6830
/*-- Constants for the back end. --*/
#define BZ_MAX_ALPHA_SIZE 258
#define BZ_MAX_CODE_LEN 23
#define BZ_RUNA 0
#define BZ_RUNB 1
#define BZ_N_GROUPS 6
#define BZ_G_SIZE 50
#define BZ_N_ITERS 4
#define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))
/*-- Stuff for doing CRCs. --*/
#define BZ_INITIALISE_CRC(crcVar) \
{ \
crcVar = 0xffffffffL; \
}
#define BZ_FINALISE_CRC(crcVar) \
{ \
crcVar = ~(crcVar); \
}
#define BZ_UPDATE_CRC(s, crcVar, cha) \
{ \
crcVar = (crcVar << 8) ^ s->crc32table[(crcVar >> 24) ^ ((uint8_t)cha)]; \
}
/*-- States and modes for compression. --*/
#define BZ_M_IDLE 1
#define BZ_M_RUNNING 2
#define BZ_M_FLUSHING 3
#define BZ_M_FINISHING 4
#define BZ_S_OUTPUT 1
#define BZ_S_INPUT 2
#define BZ_N_RADIX 2
#define BZ_N_QSORT 12
#define BZ_N_SHELL 18
#define BZ_N_OVERSHOOT (BZ_N_RADIX + BZ_N_QSORT + BZ_N_SHELL + 2)
/*-- Structure holding all the compression-side stuff. --*/
typedef struct EState {
/* pointer back to the struct bz_stream */
bz_stream *strm;
/* mode this stream is in, and whether inputting */
/* or outputting data */
uint8_t mode;
uint8_t state;
/* misc administratium */
uint8_t blockSize100k;
/* remembers avail_in when flush/finish requested */
/* bbox: not needed, strm->avail_in always has the same value */
/* commented out with '//#' throughout the code */
/* uint32_t avail_in_expect; */
/* for doing the block sorting */
uint32_t *arr1;
uint32_t *arr2;
//uint32_t *ftab; //moved into this struct, see below
uint16_t *quadrant;
int32_t budget;
/* aliases for arr1 and arr2 */
uint32_t *ptr;
uint8_t *block;
uint16_t *mtfv;
uint8_t *zbits;
/* run-length-encoding of the input */
uint32_t state_in_ch;
int32_t state_in_len;
/* input and output limits and current posns */
int32_t nblock;
int32_t nblockMAX;
//int32_t numZ; // index into s->zbits[], replaced by pointer:
uint8_t *posZ;
uint8_t *state_out_pos;
/* the buffer for bit stream creation */
uint32_t bsBuff;
int32_t bsLive;
/* block and combined CRCs */
uint32_t blockCRC;
uint32_t combinedCRC;
/* misc administratium */
int32_t blockNo;
/* stuff for coding the MTF values */
int32_t nMTF;
/* map of bytes used in block */
int32_t nInUse;
Bool inUse[256] ALIGNED(sizeof(long));
uint8_t unseqToSeq[256];
/* stuff for coding the MTF values */
int32_t mtfFreq [BZ_MAX_ALPHA_SIZE];
uint8_t selector [BZ_MAX_SELECTORS];
uint8_t selectorMtf[BZ_MAX_SELECTORS];
uint8_t len[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
/* guess what */
uint32_t crc32table[256];
/* for doing the block sorting */
uint32_t ftab[65537];
/* stack-saving measures: these can be local, but they are too big */
int32_t sendMTFValues__code [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
int32_t sendMTFValues__rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
#if BZIP2_SPEED >= 5
/* second dimension: only 3 needed; 4 makes index calculations faster */
uint32_t sendMTFValues__len_pack[BZ_MAX_ALPHA_SIZE][4];
#endif
int32_t BZ2_hbMakeCodeLengths__heap [BZ_MAX_ALPHA_SIZE + 2];
int32_t BZ2_hbMakeCodeLengths__weight[BZ_MAX_ALPHA_SIZE * 2];
int32_t BZ2_hbMakeCodeLengths__parent[BZ_MAX_ALPHA_SIZE * 2];
int32_t mainSort__copyStart[256];
int32_t mainSort__copyEnd[256];
} EState;
/*-- compression. --*/
static int32_t
BZ2_blockSort(EState*);
static void
BZ2_compressBlock(EState*, int);
static void
BZ2_bsInitWrite(EState*);
static void
BZ2_hbAssignCodes(int32_t*, uint8_t*, int32_t, int32_t, int32_t);
static void
BZ2_hbMakeCodeLengths(EState*, uint8_t*, int32_t*, int32_t, int32_t);
/*-------------------------------------------------------------*/
/*--- end bzlib_private.h ---*/
/*-------------------------------------------------------------*/
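
The three CRC macros above (BZ_INITIALISE_CRC, BZ_UPDATE_CRC, BZ_FINALISE_CRC) describe an MSB-first, table-driven CRC-32: start from 0xffffffff, shift left one byte at a time while indexing the table with the top byte, and invert at the end. The sketch below shows the same steps as standalone functions. The names crc32_msb_filltable and bz_crc32 are hypothetical, and the generator polynomial 0x04c11db7 is an assumption about the table that crc32_filltable(s->crc32table, 1) builds; it is the polynomial conventionally used for bzip2's CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Build the 256-entry MSB-first table (assumed polynomial 0x04c11db7). */
static void crc32_msb_filltable(uint32_t table[256])
{
    uint32_t i;

    for (i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        int j;

        for (j = 0; j < 8; j++)
            c = (c & 0x80000000u) ? (c << 1) ^ 0x04c11db7u : (c << 1);
        table[i] = c;
    }
}

/* Compute the CRC of buf[0..len-1] using the same steps as the macros. */
static uint32_t bz_crc32(const uint32_t table[256], const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xffffffffu;                          /* BZ_INITIALISE_CRC */
    size_t i;

    for (i = 0; i < len; i++)                            /* BZ_UPDATE_CRC */
        crc = (crc << 8) ^ table[(crc >> 24) ^ buf[i]];
    return ~crc;                                         /* BZ_FINALISE_CRC */
}
```

Under these assumptions the nine ASCII bytes "123456789" give 0xfc891918, the published check value for the CRC-32/BZIP2 parameter set. Note that this differs from the LSB-first CRC-32 used by gzip and zlib, so the two tables are not interchangeable.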

@ -0,0 +1,752 @@
/*
* bzip2 is written by Julian Seward <jseward@bzip.org>.
* Adapted for busybox by Denys Vlasenko <vda.linux@googlemail.com>.
* See README and LICENSE files in this directory for more information.
*/
/*-------------------------------------------------------------*/
/*--- Compression machinery (not incl block sorting) ---*/
/*--- compress.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* CHANGES
* 0.9.0 -- original version.
* 0.9.0a/b -- no changes in this file.
* 0.9.0c -- changed setting of nGroups in sendMTFValues()
* so as to do a bit better on small files
*/
/* #include "bzlib_private.h" */
#if BZIP2_SPEED >= 5
# define ALWAYS_INLINE_5 ALWAYS_INLINE
#else
# define ALWAYS_INLINE_5 /*nothing*/
#endif
/*---------------------------------------------------*/
/*--- Bit stream I/O ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
static
void BZ2_bsInitWrite(EState* s)
{
s->bsLive = 0;
s->bsBuff = 0;
}
/*---------------------------------------------------*/
static NOINLINE
void bsFinishWrite(EState* s)
{
while (s->bsLive > 0) {
*s->posZ++ = (uint8_t)(s->bsBuff >> 24);
s->bsBuff <<= 8;
s->bsLive -= 8;
}
}
/*---------------------------------------------------*/
static
/* Helps only on level 5, on other levels hurts. ? */
ALWAYS_INLINE_5
void bsW(EState* s, int32_t n, uint32_t v)
{
while (s->bsLive >= 8) {
*s->posZ++ = (uint8_t)(s->bsBuff >> 24);
s->bsBuff <<= 8;
s->bsLive -= 8;
}
s->bsBuff |= (v << (32 - s->bsLive - n));
s->bsLive += n;
}
/* Same with n == 16: */
static
ALWAYS_INLINE_5
void bsW16(EState* s, uint32_t v)
{
while (s->bsLive >= 8) {
*s->posZ++ = (uint8_t)(s->bsBuff >> 24);
s->bsBuff <<= 8;
s->bsLive -= 8;
}
s->bsBuff |= (v << (16 - s->bsLive));
s->bsLive += 16;
}
/* Same with n == 1: */
static
ALWAYS_INLINE /* one callsite */
void bsW1_1(EState* s)
{
/* need space for only 1 bit, no need for loop freeing > 8 bits */
if (s->bsLive >= 8) {
*s->posZ++ = (uint8_t)(s->bsBuff >> 24);
s->bsBuff <<= 8;
s->bsLive -= 8;
}
s->bsBuff |= (1 << (31 - s->bsLive));
s->bsLive += 1;
}
static
ALWAYS_INLINE_5
void bsW1_0(EState* s)
{
/* need space for only 1 bit, no need for loop freeing > 8 bits */
if (s->bsLive >= 8) {
*s->posZ++ = (uint8_t)(s->bsBuff >> 24);
s->bsBuff <<= 8;
s->bsLive -= 8;
}
//s->bsBuff |= (0 << (31 - s->bsLive));
s->bsLive += 1;
}
/*---------------------------------------------------*/
static ALWAYS_INLINE
void bsPutU16(EState* s, unsigned u)
{
bsW16(s, u);
}
/*---------------------------------------------------*/
static
void bsPutU32(EState* s, unsigned u)
{
//bsW(s, 32, u); // can't use: may try "uint32 << -n"
bsW16(s, (u >> 16) & 0xffff);
bsW16(s, u & 0xffff);
}
/*---------------------------------------------------*/
/*--- The back end proper ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
static
void makeMaps_e(EState* s)
{
int i;
unsigned cnt = 0;
for (i = 0; i < 256; i++) {
if (s->inUse[i]) {
s->unseqToSeq[i] = cnt;
cnt++;
}
}
s->nInUse = cnt;
}
/*---------------------------------------------------*/
/*
* This bit of code is performance-critical.
* On 32bit x86, gcc-6.3.0 was observed to spill ryy_j to stack,
* resulting in abysmal performance (x3 slowdown).
* Forcing it into a separate function alleviates register pressure,
* and spillage no longer happens.
* Other versions of gcc do not exhibit this problem, but out-of-line code
* seems to be helping them too (code is both smaller and faster).
* Therefore NOINLINE is enabled for the entire 32bit x86 arch for now,
* without a check for gcc version.
*/
static
#if defined __i386__
NOINLINE
#endif
int inner_loop(uint8_t *yy, uint8_t ll_i)
{
register uint8_t rtmp;
register uint8_t* ryy_j;
rtmp = yy[1];
yy[1] = yy[0];
ryy_j = &(yy[1]);
while (ll_i != rtmp) {
register uint8_t rtmp2;
ryy_j++;
rtmp2 = rtmp;
rtmp = *ryy_j;
*ryy_j = rtmp2;
}
yy[0] = rtmp;
return ryy_j - &(yy[0]);
}
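inner_loop() above is the core of the move-to-front step: it finds a symbol in the list, shifts everything before it one slot to the right, and returns the symbol's old position. A simplified sketch of the same operation is shown here. `mtf_encode_symbol` is a hypothetical name, and unlike inner_loop() it also accepts a symbol that is already at the front.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Move sym to the front of list[0..n-1] and return its previous index.
 * sym must be present in the list. */
static unsigned mtf_encode_symbol(uint8_t *list, unsigned n, uint8_t sym)
{
	unsigned pos = 0;

	while (pos < n && list[pos] != sym)
		pos++;
	assert(pos < n);                   /* symbol must exist */
	memmove(&list[1], &list[0], pos);  /* shift the prefix right by one */
	list[0] = sym;
	return pos;
}
```

Starting from the list {0, 1, 2, 3}, encoding symbol 2 returns 2 and leaves {2, 0, 1, 3}; encoding 2 again returns 0. Recently used symbols therefore map to small numbers, which is what makes the later Huffman stage effective.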
static NOINLINE
void generateMTFValues(EState* s)
{
uint8_t yy[256];
int i;
int zPend;
int32_t wr;
/*
* After sorting (eg, here),
* s->arr1[0 .. s->nblock-1] holds sorted order,
* and
* ((uint8_t*)s->arr2)[0 .. s->nblock-1]
* holds the original block data.
*
* The first thing to do is generate the MTF values,
* and put them in ((uint16_t*)s->arr1)[0 .. s->nblock-1].
*
* Because there are never more MTF values than block values,
* ptr values in this area are overwritten with MTF values
* only when they are no longer needed.
*
* The final compressed bitstream is generated into the
* area starting at &((uint8_t*)s->arr2)[s->nblock]
*
* These storage aliases are set up in bzCompressInit(),
* except for the last one, which is arranged in
* compressBlock().
*/
uint32_t* ptr = s->ptr;
makeMaps_e(s);
wr = 0;
zPend = 0;
for (i = 0; i <= s->nInUse+1; i++)
s->mtfFreq[i] = 0;
for (i = 0; i < s->nInUse; i++)
yy[i] = (uint8_t) i;
for (i = 0; i < s->nblock; i++) {
uint8_t ll_i = ll_i; /* gcc 4.3.1 thinks it may be used w/o init */
int32_t j;
AssertD(wr <= i, "generateMTFValues(1)");
j = ptr[i] - 1;
if (j < 0)
j += s->nblock;
ll_i = s->unseqToSeq[s->block[j]];
AssertD(ll_i < s->nInUse, "generateMTFValues(2a)");
if (yy[0] == ll_i) {
zPend++;
continue;
}
if (zPend > 0) {
process_zPend:
zPend--;
while (1) {
#if 0
if (zPend & 1) {
s->mtfv[wr] = BZ_RUNB; wr++;
s->mtfFreq[BZ_RUNB]++;
} else {
s->mtfv[wr] = BZ_RUNA; wr++;
s->mtfFreq[BZ_RUNA]++;
}
#else /* same as above, since BZ_RUNA is 0 and BZ_RUNB is 1 */
unsigned run = zPend & 1;
s->mtfv[wr] = run;
wr++;
s->mtfFreq[run]++;
#endif
zPend -= 2;
if (zPend < 0)
break;
zPend = (unsigned)zPend / 2;
/* bbox: unsigned div is easier */
}
if (i < 0) /* came via "goto process_zPend"? exit */
goto end;
zPend = 0;
}
j = inner_loop(yy, ll_i);
s->mtfv[wr] = j+1;
wr++;
s->mtfFreq[j+1]++;
}
i = -1;
if (zPend > 0)
goto process_zPend; /* "process it and come back here" */
end:
s->mtfv[wr] = s->nInUse+1;
wr++;
s->mtfFreq[s->nInUse+1]++;
s->nMTF = wr;
}
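The process_zPend loop in generateMTFValues() writes a run of zero MTF values as a bijective base-2 number, least significant digit first, where RUNA counts as 1 and RUNB as 2 at each position. The sketch below isolates that encoding and adds a matching decoder for illustration. The names `encode_zero_run` and `decode_zero_run` are hypothetical and do not exist in the BusyBox sources.

```c
#include <assert.h>
#include <stdint.h>

enum { RUNA = 0, RUNB = 1 };

/* Encode a run of `run` zeros (run >= 1) as RUNA/RUNB symbols.
 * Writes the symbols to out and returns how many were written. */
static unsigned encode_zero_run(unsigned run, uint8_t *out)
{
	unsigned n = 0;

	assert(run >= 1);
	run--;
	for (;;) {
		out[n++] = (run & 1) ? RUNB : RUNA;
		if (run < 2)
			break;
		run = (run - 2) / 2;
	}
	return n;
}

/* Inverse: RUNA adds 1 * weight, RUNB adds 2 * weight, weight doubles. */
static unsigned decode_zero_run(const uint8_t *sym, unsigned n)
{
	unsigned run = 0, weight = 1, i;

	for (i = 0; i < n; i++) {
		run += (sym[i] == RUNA ? 1 : 2) * weight;
		weight *= 2;
	}
	return run;
}
```

For example, a run of 4 zeros becomes RUNB, RUNA (2 * 1 + 1 * 2), and a run of 7 becomes RUNA, RUNA, RUNA (1 + 2 + 4).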
/*---------------------------------------------------*/
#define BZ_LESSER_ICOST 0
#define BZ_GREATER_ICOST 15
static NOINLINE
void sendMTFValues(EState* s)
{
int32_t t, i;
unsigned iter;
unsigned gs;
int32_t alphaSize;
unsigned nSelectors, selCtr;
int32_t nGroups;
/*
* uint8_t len[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
* is a global since the decoder also needs it.
*
* int32_t code[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
* int32_t rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
* are also globals only used in this proc.
* Made global to keep stack frame size small.
*/
#define code sendMTFValues__code
#define rfreq sendMTFValues__rfreq
#define len_pack sendMTFValues__len_pack
unsigned /*uint16_t*/ cost[BZ_N_GROUPS];
uint16_t* mtfv = s->mtfv;
alphaSize = s->nInUse + 2;
for (t = 0; t < BZ_N_GROUPS; t++) {
unsigned v;
for (v = 0; v < alphaSize; v++)
s->len[t][v] = BZ_GREATER_ICOST;
}
/*--- Decide how many coding tables to use ---*/
AssertH(s->nMTF > 0, 3001);
// 1..199 = 2
// 200..599 = 3
// 600..1199 = 4
// 1200..2399 = 5
// 2400..99999 = 6
nGroups = 2;
nGroups += (s->nMTF >= 200);
nGroups += (s->nMTF >= 600);
nGroups += (s->nMTF >= 1200);
nGroups += (s->nMTF >= 2400);
/*--- Generate an initial set of coding tables ---*/
{
unsigned nPart, remF;
nPart = nGroups;
remF = s->nMTF;
gs = 0;
while (nPart > 0) {
unsigned v;
unsigned ge;
unsigned tFreq, aFreq;
tFreq = remF / nPart;
ge = gs;
aFreq = 0;
while (aFreq < tFreq && ge < alphaSize) {
aFreq += s->mtfFreq[ge++];
}
ge--;
if (ge > gs
&& nPart != nGroups && nPart != 1
&& ((nGroups - nPart) % 2 == 1) /* bbox: can this be replaced by x & 1? */
) {
aFreq -= s->mtfFreq[ge];
ge--;
}
for (v = 0; v < alphaSize; v++)
if (v >= gs && v <= ge)
s->len[nPart-1][v] = BZ_LESSER_ICOST;
else
s->len[nPart-1][v] = BZ_GREATER_ICOST;
nPart--;
gs = ge + 1;
remF -= aFreq;
}
}
/*
* Iterate up to BZ_N_ITERS times to improve the tables.
*/
for (iter = 0; iter < BZ_N_ITERS; iter++) {
for (t = 0; t < nGroups; t++) {
unsigned v;
for (v = 0; v < alphaSize; v++)
s->rfreq[t][v] = 0;
}
#if BZIP2_SPEED >= 5
/*
* Set up an auxiliary length table which is used to fast-track
* the common case (nGroups == 6).
*/
if (nGroups == 6) {
unsigned v;
for (v = 0; v < alphaSize; v++) {
s->len_pack[v][0] = (s->len[1][v] << 16) | s->len[0][v];
s->len_pack[v][1] = (s->len[3][v] << 16) | s->len[2][v];
s->len_pack[v][2] = (s->len[5][v] << 16) | s->len[4][v];
}
}
#endif
nSelectors = 0;
gs = 0;
while (1) {
unsigned ge;
unsigned bt, bc;
/*--- Set group start & end marks. --*/
if (gs >= s->nMTF)
break;
ge = gs + BZ_G_SIZE - 1;
if (ge >= s->nMTF)
ge = s->nMTF-1;
/*
* Calculate the cost of this group as coded
* by each of the coding tables.
*/
for (t = 0; t < nGroups; t++)
cost[t] = 0;
#if BZIP2_SPEED >= 5
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
register uint32_t cost01, cost23, cost45;
register uint16_t icv;
cost01 = cost23 = cost45 = 0;
#define BZ_ITER(nn) \
icv = mtfv[gs+(nn)]; \
cost01 += s->len_pack[icv][0]; \
cost23 += s->len_pack[icv][1]; \
cost45 += s->len_pack[icv][2];
BZ_ITER(0); BZ_ITER(1); BZ_ITER(2); BZ_ITER(3); BZ_ITER(4);
BZ_ITER(5); BZ_ITER(6); BZ_ITER(7); BZ_ITER(8); BZ_ITER(9);
BZ_ITER(10); BZ_ITER(11); BZ_ITER(12); BZ_ITER(13); BZ_ITER(14);
BZ_ITER(15); BZ_ITER(16); BZ_ITER(17); BZ_ITER(18); BZ_ITER(19);
BZ_ITER(20); BZ_ITER(21); BZ_ITER(22); BZ_ITER(23); BZ_ITER(24);
BZ_ITER(25); BZ_ITER(26); BZ_ITER(27); BZ_ITER(28); BZ_ITER(29);
BZ_ITER(30); BZ_ITER(31); BZ_ITER(32); BZ_ITER(33); BZ_ITER(34);
BZ_ITER(35); BZ_ITER(36); BZ_ITER(37); BZ_ITER(38); BZ_ITER(39);
BZ_ITER(40); BZ_ITER(41); BZ_ITER(42); BZ_ITER(43); BZ_ITER(44);
BZ_ITER(45); BZ_ITER(46); BZ_ITER(47); BZ_ITER(48); BZ_ITER(49);
#undef BZ_ITER
cost[0] = cost01 & 0xffff; cost[1] = cost01 >> 16;
cost[2] = cost23 & 0xffff; cost[3] = cost23 >> 16;
cost[4] = cost45 & 0xffff; cost[5] = cost45 >> 16;
} else
#endif
{
/*--- slow version which correctly handles all situations ---*/
for (i = gs; i <= ge; i++) {
unsigned /*uint16_t*/ icv = mtfv[i];
for (t = 0; t < nGroups; t++)
cost[t] += s->len[t][icv];
}
}
/*
* Find the coding table which is best for this group,
* and record its identity in the selector table.
*/
/*bc = 999999999;*/
/*bt = -1;*/
bc = cost[0];
bt = 0;
for (t = 1 /*0*/; t < nGroups; t++) {
if (cost[t] < bc) {
bc = cost[t];
bt = t;
}
}
s->selector[nSelectors] = bt;
nSelectors++;
/*
* Increment the symbol frequencies for the selected table.
*/
/* 1% faster compress. +800 bytes */
#if BZIP2_SPEED >= 4
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
#define BZ_ITUR(nn) s->rfreq[bt][mtfv[gs + (nn)]]++
BZ_ITUR(0); BZ_ITUR(1); BZ_ITUR(2); BZ_ITUR(3); BZ_ITUR(4);
BZ_ITUR(5); BZ_ITUR(6); BZ_ITUR(7); BZ_ITUR(8); BZ_ITUR(9);
BZ_ITUR(10); BZ_ITUR(11); BZ_ITUR(12); BZ_ITUR(13); BZ_ITUR(14);
BZ_ITUR(15); BZ_ITUR(16); BZ_ITUR(17); BZ_ITUR(18); BZ_ITUR(19);
BZ_ITUR(20); BZ_ITUR(21); BZ_ITUR(22); BZ_ITUR(23); BZ_ITUR(24);
BZ_ITUR(25); BZ_ITUR(26); BZ_ITUR(27); BZ_ITUR(28); BZ_ITUR(29);
BZ_ITUR(30); BZ_ITUR(31); BZ_ITUR(32); BZ_ITUR(33); BZ_ITUR(34);
BZ_ITUR(35); BZ_ITUR(36); BZ_ITUR(37); BZ_ITUR(38); BZ_ITUR(39);
BZ_ITUR(40); BZ_ITUR(41); BZ_ITUR(42); BZ_ITUR(43); BZ_ITUR(44);
BZ_ITUR(45); BZ_ITUR(46); BZ_ITUR(47); BZ_ITUR(48); BZ_ITUR(49);
#undef BZ_ITUR
gs = ge + 1;
} else
#endif
{
/*--- slow version which correctly handles all situations ---*/
while (gs <= ge) {
s->rfreq[bt][mtfv[gs]]++;
gs++;
}
/* already is: gs = ge + 1; */
}
}
/*
* Recompute the tables based on the accumulated frequencies.
*/
/* maxLen was changed from 20 to 17 in bzip2-1.0.3. See
* comment in huffman.c for details. */
for (t = 0; t < nGroups; t++)
BZ2_hbMakeCodeLengths(s, &(s->len[t][0]), &(s->rfreq[t][0]), alphaSize, 17 /*20*/);
}
AssertH(nGroups < 8, 3002);
AssertH(nSelectors < 32768 && nSelectors <= (2 + (900000 / BZ_G_SIZE)), 3003);
/*--- Compute MTF values for the selectors. ---*/
{
uint8_t pos[BZ_N_GROUPS], ll_i, tmp2, tmp;
for (i = 0; i < nGroups; i++)
pos[i] = i;
for (i = 0; i < nSelectors; i++) {
unsigned j;
ll_i = s->selector[i];
j = 0;
tmp = pos[j];
while (ll_i != tmp) {
j++;
tmp2 = tmp;
tmp = pos[j];
pos[j] = tmp2;
}
pos[0] = tmp;
s->selectorMtf[i] = j;
}
}
/*--- Assign actual codes for the tables. --*/
for (t = 0; t < nGroups; t++) {
unsigned minLen = 32; //todo: s->len[t][0];
unsigned maxLen = 0; //todo: s->len[t][0];
for (i = 0; i < alphaSize; i++) {
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
if (s->len[t][i] < minLen) minLen = s->len[t][i];
}
AssertH(!(maxLen > 17 /*20*/), 3004);
AssertH(!(minLen < 1), 3005);
BZ2_hbAssignCodes(&(s->code[t][0]), &(s->len[t][0]), minLen, maxLen, alphaSize);
}
/*--- Transmit the mapping table. ---*/
{
/* bbox: optimized a bit more than in bzip2 */
int inUse16 = 0;
for (i = 0; i < 16; i++) {
if (sizeof(long) <= 4) {
inUse16 = inUse16*2 +
((*(bb__aliased_uint32_t*)&(s->inUse[i * 16 + 0])
| *(bb__aliased_uint32_t*)&(s->inUse[i * 16 + 4])
| *(bb__aliased_uint32_t*)&(s->inUse[i * 16 + 8])
| *(bb__aliased_uint32_t*)&(s->inUse[i * 16 + 12])) != 0);
} else { /* Our CPU can do better */
inUse16 = inUse16*2 +
((*(bb__aliased_uint64_t*)&(s->inUse[i * 16 + 0])
| *(bb__aliased_uint64_t*)&(s->inUse[i * 16 + 8])) != 0);
}
}
bsW16(s, inUse16);
inUse16 <<= (sizeof(int)*8 - 16); /* move 15th bit into sign bit */
for (i = 0; i < 16; i++) {
if (inUse16 < 0) {
unsigned v16 = 0;
unsigned j;
for (j = 0; j < 16; j++)
v16 = v16*2 + s->inUse[i * 16 + j];
bsW16(s, v16);
}
inUse16 <<= 1;
}
}
/*--- Now the selectors. ---*/
bsW(s, 3, nGroups);
bsW(s, 15, nSelectors);
for (i = 0; i < nSelectors; i++) {
unsigned j;
for (j = 0; j < s->selectorMtf[i]; j++)
bsW1_1(s);
bsW1_0(s);
}
/*--- Now the coding tables. ---*/
for (t = 0; t < nGroups; t++) {
unsigned curr = s->len[t][0];
bsW(s, 5, curr);
for (i = 0; i < alphaSize; i++) {
while (curr < s->len[t][i]) { bsW(s, 2, 2); curr++; /* 10 */ }
while (curr > s->len[t][i]) { bsW(s, 2, 3); curr--; /* 11 */ }
bsW1_0(s);
}
}
/*--- And finally, the block data proper ---*/
selCtr = 0;
gs = 0;
while (1) {
unsigned ge;
if (gs >= s->nMTF)
break;
ge = gs + BZ_G_SIZE - 1;
if (ge >= s->nMTF)
ge = s->nMTF-1;
AssertH(s->selector[selCtr] < nGroups, 3006);
/* Costs 1300 bytes and is _slower_ (on Intel Core 2) */
#if 0
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
uint16_t mtfv_i;
uint8_t* s_len_sel_selCtr = &(s->len[s->selector[selCtr]][0]);
int32_t* s_code_sel_selCtr = &(s->code[s->selector[selCtr]][0]);
#define BZ_ITAH(nn) \
mtfv_i = mtfv[gs+(nn)]; \
bsW(s, s_len_sel_selCtr[mtfv_i], s_code_sel_selCtr[mtfv_i])
BZ_ITAH(0); BZ_ITAH(1); BZ_ITAH(2); BZ_ITAH(3); BZ_ITAH(4);
BZ_ITAH(5); BZ_ITAH(6); BZ_ITAH(7); BZ_ITAH(8); BZ_ITAH(9);
BZ_ITAH(10); BZ_ITAH(11); BZ_ITAH(12); BZ_ITAH(13); BZ_ITAH(14);
BZ_ITAH(15); BZ_ITAH(16); BZ_ITAH(17); BZ_ITAH(18); BZ_ITAH(19);
BZ_ITAH(20); BZ_ITAH(21); BZ_ITAH(22); BZ_ITAH(23); BZ_ITAH(24);
BZ_ITAH(25); BZ_ITAH(26); BZ_ITAH(27); BZ_ITAH(28); BZ_ITAH(29);
BZ_ITAH(30); BZ_ITAH(31); BZ_ITAH(32); BZ_ITAH(33); BZ_ITAH(34);
BZ_ITAH(35); BZ_ITAH(36); BZ_ITAH(37); BZ_ITAH(38); BZ_ITAH(39);
BZ_ITAH(40); BZ_ITAH(41); BZ_ITAH(42); BZ_ITAH(43); BZ_ITAH(44);
BZ_ITAH(45); BZ_ITAH(46); BZ_ITAH(47); BZ_ITAH(48); BZ_ITAH(49);
#undef BZ_ITAH
gs = ge+1;
} else
#endif
{
/*--- slow version which correctly handles all situations ---*/
/* code is a bit bigger, but moves multiply out of the loop */
uint8_t* s_len_sel_selCtr = &(s->len [s->selector[selCtr]][0]);
int32_t* s_code_sel_selCtr = &(s->code[s->selector[selCtr]][0]);
while (gs <= ge) {
bsW(s,
s_len_sel_selCtr[mtfv[gs]],
s_code_sel_selCtr[mtfv[gs]]
);
gs++;
}
/* already is: gs = ge+1; */
}
selCtr++;
}
AssertH(selCtr == nSelectors, 3007);
#undef code
#undef rfreq
#undef len_pack
}
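The "Now the coding tables" step in sendMTFValues() transmits each table as a 5-bit starting length followed by per-symbol deltas: the bit pair 10 raises the current length by one, 11 lowers it by one, and a single 0 bit ends the symbol. The sketch below reproduces that layout as a string of '0' and '1' characters so it can be inspected. `append_length_bits` is a hypothetical helper, and the text output is purely an assumption made for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append the delta-coded lengths to the NUL-terminated string in out.
 * Returns the new string length. The caller provides enough room. */
static size_t append_length_bits(char *out, const uint8_t *len, size_t count)
{
	size_t n = strlen(out);
	size_t i;
	unsigned curr;
	int b;

	assert(count > 0);
	curr = len[0];
	for (b = 4; b >= 0; b--)              /* 5-bit starting length */
		out[n++] = ((curr >> b) & 1) ? '1' : '0';
	for (i = 0; i < count; i++) {
		while (curr < len[i]) { out[n++] = '1'; out[n++] = '0'; curr++; }
		while (curr > len[i]) { out[n++] = '1'; out[n++] = '1'; curr--; }
		out[n++] = '0';                   /* this symbol is done */
	}
	out[n] = '\0';
	return n;
}
```

For the lengths {3, 3, 5, 4} this produces 00011 0 0 10100 110, that is 15 bits in total. Neighbouring symbols usually have similar lengths, so most symbols cost a single bit.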
/*---------------------------------------------------*/
static
void BZ2_compressBlock(EState* s, int is_last_block)
{
int32_t origPtr = origPtr;
if (s->nblock > 0) {
BZ_FINALISE_CRC(s->blockCRC);
s->combinedCRC = (s->combinedCRC << 1) | (s->combinedCRC >> 31);
s->combinedCRC ^= s->blockCRC;
if (s->blockNo > 1)
s->posZ = s->zbits; // was: s->numZ = 0;
origPtr = BZ2_blockSort(s);
}
s->zbits = &((uint8_t*)s->arr2)[s->nblock];
s->posZ = s->zbits;
s->state_out_pos = s->zbits;
/*-- If this is the first block, create the stream header. --*/
if (s->blockNo == 1) {
BZ2_bsInitWrite(s);
/*bsPutU8(s, BZ_HDR_B);*/
/*bsPutU8(s, BZ_HDR_Z);*/
/*bsPutU8(s, BZ_HDR_h);*/
/*bsPutU8(s, BZ_HDR_0 + s->blockSize100k);*/
bsPutU32(s, BZ_HDR_BZh0 + s->blockSize100k);
}
if (s->nblock > 0) {
/*bsPutU8(s, 0x31);*/
/*bsPutU8(s, 0x41);*/
/*bsPutU8(s, 0x59);*/
/*bsPutU8(s, 0x26);*/
bsPutU32(s, 0x31415926);
/*bsPutU8(s, 0x53);*/
/*bsPutU8(s, 0x59);*/
bsPutU16(s, 0x5359);
/*-- Now the block's CRC, so it is in a known place. --*/
bsPutU32(s, s->blockCRC);
/*
* Now a single bit indicating (non-)randomisation.
* As of version 0.9.5, we use a better sorting algorithm
* which makes randomisation unnecessary. So always set
* the randomised bit to 'no'. Of course, the decoder
* still needs to be able to handle randomised blocks
* so as to maintain backwards compatibility with
* older versions of bzip2.
*/
bsW1_0(s);
bsW(s, 24, origPtr);
generateMTFValues(s);
sendMTFValues(s);
}
/*-- If this is the last block, add the stream trailer. --*/
if (is_last_block) {
/*bsPutU8(s, 0x17);*/
/*bsPutU8(s, 0x72);*/
/*bsPutU8(s, 0x45);*/
/*bsPutU8(s, 0x38);*/
bsPutU32(s, 0x17724538);
/*bsPutU8(s, 0x50);*/
/*bsPutU8(s, 0x90);*/
bsPutU16(s, 0x5090);
bsPutU32(s, s->combinedCRC);
bsFinishWrite(s);
}
}
/*-------------------------------------------------------------*/
/*--- end compress.c ---*/
/*-------------------------------------------------------------*/
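BZ2_compressBlock() folds every block CRC into the stream-wide combined CRC by rotating the running value left by one bit and XOR-ing in the new block CRC. A small sketch of that fold follows. `fold_block_crc` and `combine_crcs` are hypothetical names, and starting the running value at 0 is an assumption about how the state is initialised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step: rotate the running value left by 1, then XOR the block CRC. */
static uint32_t fold_block_crc(uint32_t combined, uint32_t block_crc)
{
	combined = (combined << 1) | (combined >> 31);
	return combined ^ block_crc;
}

/* Fold a sequence of block CRCs, assuming the running value starts at 0. */
static uint32_t combine_crcs(const uint32_t *block_crc, size_t n)
{
	uint32_t combined = 0;
	size_t i;

	assert(block_crc != NULL || n == 0);
	for (i = 0; i < n; i++)
		combined = fold_block_crc(combined, block_crc[i]);
	return combined;
}
```

Because of the rotation, the result depends on block order: folding {1, 0x80000000} gives 0x80000002, while folding {0x80000000, 1} gives 0.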


@ -0,0 +1,229 @@
/*
* bzip2 is written by Julian Seward <jseward@bzip.org>.
* Adapted for busybox by Denys Vlasenko <vda.linux@googlemail.com>.
* See README and LICENSE files in this directory for more information.
*/
/*-------------------------------------------------------------*/
/*--- Huffman coding low-level stuff ---*/
/*--- huffman.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.4 of 20 December 2006
Copyright (C) 1996-2006 Julian Seward <jseward@bzip.org>
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* #include "bzlib_private.h" */
/*---------------------------------------------------*/
#define WEIGHTOF(zz0) ((zz0) & 0xffffff00)
#define DEPTHOF(zz1) ((zz1) & 0x000000ff)
#define MYMAX(zz2,zz3) ((zz2) > (zz3) ? (zz2) : (zz3))
#define ADDWEIGHTS(zw1,zw2) \
((WEIGHTOF(zw1)+WEIGHTOF(zw2)) | \
(1 + MYMAX(DEPTHOF(zw1),DEPTHOF(zw2))))
#define UPHEAP(z) \
{ \
int32_t zz, tmp; \
zz = z; \
tmp = heap[zz]; \
while (weight[tmp] < weight[heap[zz >> 1]]) { \
heap[zz] = heap[zz >> 1]; \
zz >>= 1; \
} \
heap[zz] = tmp; \
}
/* 90 bytes, 0.3% of overall compress speed */
#if BZIP2_SPEED >= 1
/* macro works better than inline (gcc 4.2.1) */
#define DOWNHEAP1(heap, weight, nHeap) \
{ \
int32_t zz, yy, tmp; \
zz = 1; \
tmp = heap[zz]; \
while (1) { \
yy = zz << 1; \
if (yy > nHeap) \
break; \
if (yy < nHeap \
&& weight[heap[yy+1]] < weight[heap[yy]]) \
yy++; \
if (weight[tmp] < weight[heap[yy]]) \
break; \
heap[zz] = heap[yy]; \
zz = yy; \
} \
heap[zz] = tmp; \
}
#else
static
void DOWNHEAP1(int32_t *heap, int32_t *weight, int32_t nHeap)
{
int32_t zz, yy, tmp;
zz = 1;
tmp = heap[zz];
while (1) {
yy = zz << 1;
if (yy > nHeap)
break;
if (yy < nHeap
&& weight[heap[yy + 1]] < weight[heap[yy]])
yy++;
if (weight[tmp] < weight[heap[yy]])
break;
heap[zz] = heap[yy];
zz = yy;
}
heap[zz] = tmp;
}
#endif
/*---------------------------------------------------*/
static
void BZ2_hbMakeCodeLengths(EState *s,
uint8_t *len,
int32_t *freq,
int32_t alphaSize,
int32_t maxLen)
{
/*
* Nodes and heap entries run from 1. Entry 0
* for both the heap and nodes is a sentinel.
*/
int32_t nNodes, nHeap, n1, n2, i, j, k;
Bool tooLong;
/* bbox: moved to EState to save stack
int32_t heap [BZ_MAX_ALPHA_SIZE + 2];
int32_t weight[BZ_MAX_ALPHA_SIZE * 2];
int32_t parent[BZ_MAX_ALPHA_SIZE * 2];
*/
#define heap (s->BZ2_hbMakeCodeLengths__heap)
#define weight (s->BZ2_hbMakeCodeLengths__weight)
#define parent (s->BZ2_hbMakeCodeLengths__parent)
for (i = 0; i < alphaSize; i++)
weight[i+1] = (freq[i] == 0 ? 1 : freq[i]) << 8;
while (1) {
nNodes = alphaSize;
nHeap = 0;
heap[0] = 0;
weight[0] = 0;
parent[0] = -2;
for (i = 1; i <= alphaSize; i++) {
parent[i] = -1;
nHeap++;
heap[nHeap] = i;
UPHEAP(nHeap);
}
AssertH(nHeap < (BZ_MAX_ALPHA_SIZE+2), 2001);
while (nHeap > 1) {
n1 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP1(heap, weight, nHeap);
n2 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP1(heap, weight, nHeap);
nNodes++;
parent[n1] = parent[n2] = nNodes;
weight[nNodes] = ADDWEIGHTS(weight[n1], weight[n2]);
parent[nNodes] = -1;
nHeap++;
heap[nHeap] = nNodes;
UPHEAP(nHeap);
}
AssertH(nNodes < (BZ_MAX_ALPHA_SIZE * 2), 2002);
tooLong = False;
for (i = 1; i <= alphaSize; i++) {
j = 0;
k = i;
while (parent[k] >= 0) {
k = parent[k];
j++;
}
len[i-1] = j;
if (j > maxLen)
tooLong = True;
}
if (!tooLong)
break;
/* 17 Oct 04: keep-going condition for the following loop used
to be 'i < alphaSize', which missed the last element,
theoretically leading to the possibility of the compressor
looping. However, this count-scaling step is only needed if
one of the generated Huffman code words is longer than
maxLen, which up to and including version 1.0.2 was 20 bits,
which is extremely unlikely. In version 1.0.3 maxLen was
changed to 17 bits, which has minimal effect on compression
ratio, but does mean this scaling step is used from time to
time, enough to verify that it works.
This means that bzip2-1.0.3 and later will only produce
Huffman codes with a maximum length of 17 bits. However, in
order to preserve backwards compatibility with bitstreams
produced by versions pre-1.0.3, the decompressor must still
handle lengths of up to 20. */
for (i = 1; i <= alphaSize; i++) {
j = weight[i] >> 8;
/* bbox: yes, it is a signed division.
* don't replace with shift! */
j = 1 + (j / 2);
weight[i] = j << 8;
}
}
#undef heap
#undef weight
#undef parent
}
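The WEIGHTOF, DEPTHOF and ADDWEIGHTS macros near the top of this file pack two values into one word: the frequency sum in the upper 24 bits and the subtree depth in the low 8 bits. The sketch below spells that out as functions. `leaf_weight` and `add_weights` are hypothetical names, and unsigned arithmetic is used here for simplicity where the original uses int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Weight of a leaf: frequency (at least 1) in the upper 24 bits, depth 0. */
static uint32_t leaf_weight(uint32_t freq)
{
	assert(freq < (1u << 24));
	return (freq == 0 ? 1 : freq) << 8;
}

/* Weight of a parent: summed frequencies, depth = 1 + deeper child. */
static uint32_t add_weights(uint32_t a, uint32_t b)
{
	uint32_t freq_sum = (a & 0xffffff00u) + (b & 0xffffff00u);
	uint32_t da = a & 0xffu;
	uint32_t db = b & 0xffu;

	return freq_sum | (1 + (da > db ? da : db));
}
```

Joining leaves with frequencies 3 and 5 gives 0x801 (frequency 8, depth 1). A plain integer comparison of packed weights then prefers the shallower of two subtrees with equal frequency, which helps keep code lengths short.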
/*---------------------------------------------------*/
static
void BZ2_hbAssignCodes(int32_t *code,
uint8_t *length,
int32_t minLen,
int32_t maxLen,
int32_t alphaSize)
{
int32_t n, vec, i;
vec = 0;
for (n = minLen; n <= maxLen; n++) {
for (i = 0; i < alphaSize; i++) {
if (length[i] == n) {
code[i] = vec;
vec++;
}
}
vec <<= 1;
}
}
/*-------------------------------------------------------------*/
/*--- end huffman.c ---*/
/*-------------------------------------------------------------*/
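BZ2_hbAssignCodes() above hands out canonical Huffman codes: symbols are visited in order of increasing code length, each one gets the next counter value, and the counter is shifted left whenever the length grows. The restatement below exists only to show a worked example. `assign_canonical_codes` is a hypothetical name, and the precondition check is an addition that the original does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Assign canonical codes from code lengths, shortest lengths first. */
static void assign_canonical_codes(int32_t *code, const uint8_t *length,
		int min_len, int max_len, int alpha_size)
{
	int32_t vec = 0;
	int n, i;

	assert(min_len >= 1 && min_len <= max_len);
	for (n = min_len; n <= max_len; n++) {
		for (i = 0; i < alpha_size; i++) {
			if (length[i] == n)
				code[i] = vec++;   /* next free code of this length */
		}
		vec <<= 1;                 /* move on to codes one bit longer */
	}
}
```

With lengths {2, 1, 3, 3} the resulting codes are 10, 0, 110 and 111 in binary, which form a prefix-free set. Only the lengths need to be transmitted, since the decoder can rebuild the same codes from them.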

View File

@ -0,0 +1,8 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
const char cpio_TRAILER[] ALIGN1 = "TRAILER!!!";


@ -0,0 +1,14 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC data_align(archive_handle_t *archive_handle, unsigned boundary)
{
unsigned skip_amount = (boundary - (archive_handle->offset % boundary)) % boundary;
archive_handle->seek(archive_handle->src_fd, skip_amount);
archive_handle->offset += skip_amount;
}
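The expression in data_align() computes how many bytes must be skipped to reach the next multiple of `boundary`; the outer modulo makes the result 0 when the offset is already aligned. A stand-alone sketch of the arithmetic is given here, where `pad_to_boundary` is a hypothetical name and `unsigned long` stands in for the real offset type.

```c
#include <assert.h>

/* Bytes needed to advance offset to the next multiple of boundary. */
static unsigned pad_to_boundary(unsigned long offset, unsigned boundary)
{
	assert(boundary != 0);
	return (unsigned)((boundary - (offset % boundary)) % boundary);
}
```

For a 512-byte boundary, offset 1 needs 511 bytes of padding and offset 512 needs none. For a 4-byte boundary, offset 110 needs 2.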


@ -0,0 +1,259 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC data_extract_all(archive_handle_t *archive_handle)
{
file_header_t *file_header = archive_handle->file_header;
int dst_fd;
int res;
char *hard_link;
#if ENABLE_FEATURE_TAR_LONG_OPTIONS
char *dst_name;
#else
# define dst_name (file_header->name)
#endif
#if ENABLE_FEATURE_TAR_SELINUX
char *sctx = archive_handle->tar__sctx[PAX_NEXT_FILE];
if (!sctx)
sctx = archive_handle->tar__sctx[PAX_GLOBAL];
if (sctx) { /* setfscreatecon is 4 syscalls, avoid if possible */
setfscreatecon(sctx);
free(archive_handle->tar__sctx[PAX_NEXT_FILE]);
archive_handle->tar__sctx[PAX_NEXT_FILE] = NULL;
}
#endif
/* Hard links are encoded as regular files of size 0
* with a nonempty link field */
hard_link = NULL;
if (S_ISREG(file_header->mode) && file_header->size == 0)
hard_link = file_header->link_target;
#if ENABLE_FEATURE_TAR_LONG_OPTIONS
dst_name = file_header->name;
if (archive_handle->tar__strip_components) {
unsigned n = archive_handle->tar__strip_components;
do {
dst_name = strchr(dst_name, '/');
if (!dst_name || dst_name[1] == '\0') {
data_skip(archive_handle);
goto ret;
}
dst_name++;
/*
* Link target is shortened only for hardlinks:
* softlinks are restored unchanged.
*/
if (hard_link) {
// GNU tar 1.26 does not check that we reached end of link name:
// if "dir/hardlink" is hardlinked to "file",
// tar xvf a.tar --strip-components=1 says:
// tar: hardlink: Cannot hard link to '': No such file or directory
// and continues processing. We silently skip such entries.
hard_link = strchr(hard_link, '/');
if (!hard_link || hard_link[1] == '\0') {
data_skip(archive_handle);
goto ret;
}
hard_link++;
}
} while (--n != 0);
}
#endif
if (archive_handle->ah_flags & ARCHIVE_CREATE_LEADING_DIRS) {
char *slash = strrchr(dst_name, '/');
if (slash) {
*slash = '\0';
bb_make_directory(dst_name, -1, FILEUTILS_RECUR);
*slash = '/';
}
}
if (archive_handle->ah_flags & ARCHIVE_UNLINK_OLD) {
/* Remove the entry if it exists */
if (!S_ISDIR(file_header->mode)) {
if (hard_link) {
/* Ugly special case:
* tar cf t.tar hardlink1 hardlink2 hardlink1
* results in this tarball structure:
* hardlink1
* hardlink2 -> hardlink1
* hardlink1 -> hardlink1 <== !!!
*/
if (strcmp(hard_link, dst_name) == 0)
goto ret;
}
/* Proceed with deleting */
if (unlink(dst_name) == -1
&& errno != ENOENT
) {
bb_perror_msg_and_die("can't remove old file %s",
dst_name);
}
}
}
else if (archive_handle->ah_flags & ARCHIVE_EXTRACT_NEWER) {
/* Remove the existing entry if it is older than the extracted entry */
struct stat existing_sb;
if (lstat(dst_name, &existing_sb) == -1) {
if (errno != ENOENT) {
bb_simple_perror_msg_and_die("can't stat old file");
}
}
else if (existing_sb.st_mtime >= file_header->mtime) {
if (!S_ISDIR(file_header->mode)) {
bb_error_msg("%s not created: newer or "
"same age file exists", dst_name);
}
data_skip(archive_handle);
goto ret;
}
else if ((unlink(dst_name) == -1) && (errno != EISDIR)) {
bb_perror_msg_and_die("can't remove old file %s",
dst_name);
}
}
/* Handle hard links separately */
if (hard_link) {
create_or_remember_link(&archive_handle->link_placeholders,
hard_link,
dst_name,
1);
/* Hardlinks have no separate mode/ownership, skip chown/chmod */
goto ret;
}
/* Create the filesystem entry */
switch (file_header->mode & S_IFMT) {
case S_IFREG: {
/* Regular file */
char *dst_nameN;
int flags = O_WRONLY | O_CREAT | O_EXCL;
if (archive_handle->ah_flags & ARCHIVE_O_TRUNC)
flags = O_WRONLY | O_CREAT | O_TRUNC;
dst_nameN = dst_name;
#ifdef ARCHIVE_REPLACE_VIA_RENAME
if (archive_handle->ah_flags & ARCHIVE_REPLACE_VIA_RENAME)
/* rpm-style temp file name */
dst_nameN = xasprintf("%s;%x", dst_name, (int)getpid());
#endif
dst_fd = xopen3(dst_nameN,
flags,
file_header->mode
);
bb_copyfd_exact_size(archive_handle->src_fd, dst_fd, file_header->size);
close(dst_fd);
#ifdef ARCHIVE_REPLACE_VIA_RENAME
if (archive_handle->ah_flags & ARCHIVE_REPLACE_VIA_RENAME) {
xrename(dst_nameN, dst_name);
free(dst_nameN);
}
#endif
break;
}
case S_IFDIR:
//TODO: this causes problems if tarball contains a r-xr-xr-x directory:
// we create this directory, and then fail to create files inside it
// (if tar xf isn't run as root).
// GNU tar works around this by chmod-ing directories *after* all files are extracted.
res = mkdir(dst_name, file_header->mode);
if ((res != 0)
&& (errno != EISDIR) /* btw, Linux doesn't return this */
&& (errno != EEXIST)
) {
bb_perror_msg("can't make dir %s", dst_name);
}
break;
case S_IFLNK:
/* Symlink */
//TODO: what if file_header->link_target == NULL (say, corrupted tarball?)
/* To avoid a directory traversal attack via symlinks,
* do not restore symlinks with ".." components
* or symlinks starting with "/", unless a magic
* envvar is set.
*
* For example, consider a .tar created via:
* $ tar cvf bug.tar anything.txt
* $ ln -s /tmp symlink
* $ tar --append -f bug.tar symlink
* $ rm symlink
* $ mkdir symlink
* $ tar --append -f bug.tar symlink/evil.py
*
* This will result in an archive that contains:
* $ tar --list -f bug.tar
* anything.txt
* symlink [-> /tmp]
* symlink/evil.py
*
* Untarring bug.tar would otherwise place evil.py in '/tmp'.
*/
create_or_remember_link(&archive_handle->link_placeholders,
file_header->link_target,
dst_name,
0);
break;
case S_IFSOCK:
case S_IFBLK:
case S_IFCHR:
case S_IFIFO:
res = mknod(dst_name, file_header->mode, file_header->device);
if (res != 0) {
bb_perror_msg("can't create node %s", dst_name);
}
break;
default:
bb_simple_error_msg_and_die("unrecognized file type");
}
if (!S_ISLNK(file_header->mode)) {
if (!(archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_OWNER)) {
uid_t uid = file_header->uid;
gid_t gid = file_header->gid;
#if ENABLE_FEATURE_TAR_UNAME_GNAME
if (!(archive_handle->ah_flags & ARCHIVE_NUMERIC_OWNER)) {
if (file_header->tar__uname) {
//TODO: cache last name/id pair?
struct passwd *pwd = getpwnam(file_header->tar__uname);
if (pwd) uid = pwd->pw_uid;
}
if (file_header->tar__gname) {
struct group *grp = getgrnam(file_header->tar__gname);
if (grp) gid = grp->gr_gid;
}
}
#endif
/* GNU tar 1.15.1 uses chown, not lchown */
chown(dst_name, uid, gid);
}
/* uclibc has no lchmod, glibc is even stranger -
* it has lchmod which seems to do nothing!
* so we use chmod... */
if (!(archive_handle->ah_flags & ARCHIVE_DONT_RESTORE_PERM)) {
chmod(dst_name, file_header->mode);
}
if (archive_handle->ah_flags & ARCHIVE_RESTORE_DATE) {
struct timeval t[2];
t[1].tv_sec = t[0].tv_sec = file_header->mtime;
t[1].tv_usec = t[0].tv_usec = 0;
utimes(dst_name, t);
}
}
ret: ;
#if ENABLE_FEATURE_TAR_SELINUX
if (sctx) {
/* reset the context after creating an entry */
setfscreatecon(NULL);
}
#endif
}
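The strip-components loop in data_extract_all() drops leading path components one '/' at a time and skips the entry entirely when nothing is left afterwards. The sketch below shows the same rule in isolation. `strip_components` is a hypothetical helper that returns NULL where the original calls data_skip().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer past the first n components of path, or NULL if
 * no name would remain (the entry should then be skipped). */
static const char *strip_components(const char *path, unsigned n)
{
	assert(path != NULL);
	while (n-- != 0) {
		path = strchr(path, '/');
		if (!path || path[1] == '\0')
			return NULL;
		path++;
	}
	return path;
}
```

With "a/b/c.txt", stripping one component leaves "b/c.txt" and stripping two leaves "c.txt". Stripping three, or stripping one from "dir/", leaves nothing, so the entry is skipped.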


@ -0,0 +1,136 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
enum {
//TAR_FILETYPE,
TAR_MODE,
TAR_FILENAME,
TAR_REALNAME,
#if ENABLE_FEATURE_TAR_UNAME_GNAME
TAR_UNAME,
TAR_GNAME,
#endif
TAR_SIZE,
TAR_UID,
TAR_GID,
TAR_MAX,
};
static const char *const tar_var[] ALIGN_PTR = {
// "FILETYPE",
"MODE",
"FILENAME",
"REALNAME",
#if ENABLE_FEATURE_TAR_UNAME_GNAME
"UNAME",
"GNAME",
#endif
"SIZE",
"UID",
"GID",
};
static void xputenv(char *str)
{
if (putenv(str))
bb_die_memory_exhausted();
}
static void str2env(char *env[], int idx, const char *str)
{
env[idx] = xasprintf("TAR_%s=%s", tar_var[idx], str);
xputenv(env[idx]);
}
static void dec2env(char *env[], int idx, unsigned long long val)
{
env[idx] = xasprintf("TAR_%s=%llu", tar_var[idx], val);
xputenv(env[idx]);
}
static void oct2env(char *env[], int idx, unsigned long val)
{
env[idx] = xasprintf("TAR_%s=%lo", tar_var[idx], val);
xputenv(env[idx]);
}
void FAST_FUNC data_extract_to_command(archive_handle_t *archive_handle)
{
file_header_t *file_header = archive_handle->file_header;
#if 0 /* do we need this? ENABLE_FEATURE_TAR_SELINUX */
char *sctx = archive_handle->tar__sctx[PAX_NEXT_FILE];
if (!sctx)
sctx = archive_handle->tar__sctx[PAX_GLOBAL];
if (sctx) { /* setfscreatecon is 4 syscalls, avoid if possible */
setfscreatecon(sctx);
free(archive_handle->tar__sctx[PAX_NEXT_FILE]);
archive_handle->tar__sctx[PAX_NEXT_FILE] = NULL;
}
#endif
if ((file_header->mode & S_IFMT) == S_IFREG) {
pid_t pid;
int p[2], status;
char *tar_env[TAR_MAX];
memset(tar_env, 0, sizeof(tar_env));
xpipe(p);
pid = BB_MMU ? xfork() : xvfork();
if (pid == 0) {
/* Child */
/* str2env(tar_env, TAR_FILETYPE, "f"); - parent should do it once */
oct2env(tar_env, TAR_MODE, file_header->mode);
str2env(tar_env, TAR_FILENAME, file_header->name);
str2env(tar_env, TAR_REALNAME, file_header->name);
#if ENABLE_FEATURE_TAR_UNAME_GNAME
str2env(tar_env, TAR_UNAME, file_header->tar__uname);
str2env(tar_env, TAR_GNAME, file_header->tar__gname);
#endif
dec2env(tar_env, TAR_SIZE, file_header->size);
dec2env(tar_env, TAR_UID, file_header->uid);
dec2env(tar_env, TAR_GID, file_header->gid);
close(p[1]);
xdup2(p[0], STDIN_FILENO);
signal(SIGPIPE, SIG_DFL);
execl(archive_handle->tar__to_command_shell,
archive_handle->tar__to_command_shell,
"-c",
archive_handle->tar__to_command,
(char *)0);
bb_perror_msg_and_die("can't execute '%s'", archive_handle->tar__to_command_shell);
}
close(p[0]);
/* Our caller is expected to do signal(SIGPIPE, SIG_IGN)
* so that we don't die if the child doesn't read all the input.
* The negated size asks bb_copyfd_exact_size() to ignore write errors: */
bb_copyfd_exact_size(archive_handle->src_fd, p[1], -file_header->size);
close(p[1]);
status = wait_for_exitstatus(pid);
if (WIFEXITED(status) && WEXITSTATUS(status))
bb_error_msg_and_die("'%s' returned status %d",
archive_handle->tar__to_command, WEXITSTATUS(status));
if (WIFSIGNALED(status))
bb_error_msg_and_die("'%s' terminated by signal %d",
archive_handle->tar__to_command, WTERMSIG(status));
if (!BB_MMU) {
int i;
for (i = 0; i < TAR_MAX; i++) {
if (tar_env[i])
bb_unsetenv_and_free(tar_env[i]);
}
}
}
#if 0 /* ENABLE_FEATURE_TAR_SELINUX */
if (sctx)
/* reset the context after creating an entry */
setfscreatecon(NULL);
#endif
}
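The str2env(), dec2env() and oct2env() helpers above build "TAR_<NAME>=<value>" strings for the child process; note that the mode is formatted in octal and includes the file-type bits. The sketch below shows the formatting only, using plain malloc() and snprintf() in place of xasprintf() and leaving out the putenv() call. The `tar_env_*` names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "TAR_<name>=<value>" in freshly allocated memory (caller frees).
 * Returns NULL if allocation fails. */
static char *tar_env_str(const char *name, const char *value)
{
	size_t size;
	char *s;

	assert(name != NULL && value != NULL);
	size = strlen("TAR_") + strlen(name) + 1 + strlen(value) + 1;
	s = malloc(size);
	if (s)
		snprintf(s, size, "TAR_%s=%s", name, value);
	return s;
}

/* Same, with the value formatted as an octal number. */
static char *tar_env_oct(const char *name, unsigned long value)
{
	char num[32];

	snprintf(num, sizeof(num), "%lo", value);
	return tar_env_str(name, num);
}

/* Same, with the value formatted as an unsigned decimal number. */
static char *tar_env_dec(const char *name, unsigned long long value)
{
	char num[32];

	snprintf(num, sizeof(num), "%llu", value);
	return tar_env_str(name, num);
}
```

A regular file with permissions 0644 has mode 0100644, so the child sees TAR_MODE=100644. A 1234-byte file yields TAR_SIZE=1234.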


@ -0,0 +1,13 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC data_extract_to_stdout(archive_handle_t *archive_handle)
{
bb_copyfd_exact_size(archive_handle->src_fd,
STDOUT_FILENO,
archive_handle->file_header->size);
}


@ -0,0 +1,11 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC data_skip(archive_handle_t *archive_handle)
{
archive_handle->seek(archive_handle->src_fd, archive_handle->file_header->size);
}


@ -0,0 +1,900 @@
/* vi: set sw=4 ts=4: */
/*
* Small bzip2 decompression implementation, by Rob Landley (rob@landley.net).
*
* Based on bzip2 decompression code by Julian R Seward (jseward@acm.org),
* which also acknowledges contributions by Mike Burrows, David Wheeler,
* Peter Fenwick, Alistair Moffat, Radford Neal, Ian H. Witten,
* Robert Sedgewick, and Jon L. Bentley.
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
/*
Size and speed optimizations by Manuel Novoa III (mjn3@codepoet.org).
More efficient reading of Huffman codes, a streamlined read_bunzip()
function, and various other tweaks. In (limited) tests, approximately
20% faster than bzcat on x86 and about 10% faster on arm.
Note that about 2/3 of the time is spent in read_bunzip() reversing
the Burrows-Wheeler transformation. Much of that time is delay
resulting from cache misses.
(2010 update by vda: profiled "bzcat <84mbyte.bz2 >/dev/null"
on x86-64 CPU with L2 > 1M: get_next_block is hotter than read_bunzip:
%time seconds calls function
71.01 12.69 444 get_next_block
28.65 5.12 93065 read_bunzip
00.22 0.04 7736490 get_bits
00.11 0.02 47 dealloc_bunzip
00.00 0.00 93018 full_write
...)
I would ask that anyone benefiting from this work, especially those
using it in commercial products, consider making a donation to my local
non-profit hospice organization (www.hospiceacadiana.com) in the name of
the woman I loved, Toni W. Hagan, who passed away Feb. 12, 2003.
Manuel
*/
#include "libbb.h"
#include "bb_archive.h"
#if 0
# define dbg(...) bb_error_msg(__VA_ARGS__)
#else
# define dbg(...) ((void)0)
#endif
/* Constants for Huffman coding */
#define MAX_GROUPS 6
#define GROUP_SIZE 50 /* 64 would have been more efficient */
#define MAX_HUFCODE_BITS 20 /* Longest Huffman code allowed */
#define MAX_SYMBOLS 258 /* 256 literals + RUNA + RUNB */
#define SYMBOL_RUNA 0
#define SYMBOL_RUNB 1
/* Status return values */
#define RETVAL_OK 0
#define RETVAL_LAST_BLOCK (dbg("%d", __LINE__), -1)
#define RETVAL_NOT_BZIP_DATA (dbg("%d", __LINE__), -2)
#define RETVAL_UNEXPECTED_INPUT_EOF (dbg("%d", __LINE__), -3)
#define RETVAL_SHORT_WRITE (dbg("%d", __LINE__), -4)
#define RETVAL_DATA_ERROR (dbg("%d", __LINE__), -5)
#define RETVAL_OUT_OF_MEMORY (dbg("%d", __LINE__), -6)
#define RETVAL_OBSOLETE_INPUT (dbg("%d", __LINE__), -7)
/* Other housekeeping constants */
#define IOBUF_SIZE 4096
/* This is what we know about each Huffman coding group */
struct group_data {
/* We have an extra slot at the end of limit[] for a sentinel value. */
int limit[MAX_HUFCODE_BITS+1], base[MAX_HUFCODE_BITS], permute[MAX_SYMBOLS];
int minLen, maxLen;
};
/* Structure holding all the housekeeping data, including IO buffers and
* memory that persists between calls to bunzip
* Found the most used member:
* cat this_file.c | sed -e 's/"/ /g' -e "s/'/ /g" | xargs -n1 \
* | grep 'bd->' | sed 's/^.*bd->/bd->/' | sort | $PAGER
* and moved it (inbufBitCount) to offset 0.
*/
struct bunzip_data {
/* I/O tracking data (file handles, buffers, positions, etc.) */
unsigned inbufBitCount, inbufBits;
int in_fd, out_fd, inbufCount, inbufPos /*, outbufPos*/;
uint8_t *inbuf /*,*outbuf*/;
/* State for interrupting output loop */
int writeCopies, writePos, writeRunCountdown, writeCount;
int writeCurrent; /* actually a uint8_t */
/* The CRC values stored in the block header and calculated from the data */
uint32_t headerCRC, totalCRC, writeCRC;
/* Intermediate buffer and its size (in bytes) */
uint32_t *dbuf;
unsigned dbufSize;
/* For I/O error handling */
jmp_buf *jmpbuf;
/* Big things go last (register-relative addressing can be larger for big offsets) */
uint32_t crc32Table[256];
uint8_t selectors[32768]; /* nSelectors=15 bits */
struct group_data groups[MAX_GROUPS]; /* Huffman coding tables */
};
typedef struct bunzip_data bunzip_data;
/* Return the next bits_wanted bits of input. All reads from the compressed
input are done through this function. All reads are big endian. */
static unsigned get_bits(bunzip_data *bd, int bits_wanted)
{
unsigned bits = 0;
/* Cache bd->inbufBitCount in a CPU register (hopefully): */
int bit_count = bd->inbufBitCount;
/* If we need to get more data from the byte buffer, do so. (Loop getting
one byte at a time to enforce endianness and avoid unaligned access.) */
while (bit_count < bits_wanted) {
/* If we need to read more data from file into byte buffer, do so */
if (bd->inbufPos == bd->inbufCount) {
/* if "no input fd" case: in_fd == -1, read fails, we jump */
bd->inbufCount = read(bd->in_fd, bd->inbuf, IOBUF_SIZE);
if (bd->inbufCount <= 0)
longjmp(*bd->jmpbuf, RETVAL_UNEXPECTED_INPUT_EOF);
bd->inbufPos = 0;
}
/* Avoid 32-bit overflow (dump bit buffer to top of output) */
if (bit_count >= 24) {
bits = bd->inbufBits & ((1U << bit_count) - 1);
bits_wanted -= bit_count;
bits <<= bits_wanted;
bit_count = 0;
}
/* Grab next 8 bits of input from buffer. */
bd->inbufBits = (bd->inbufBits << 8) | bd->inbuf[bd->inbufPos++];
bit_count += 8;
}
/* Calculate result */
bit_count -= bits_wanted;
bd->inbufBitCount = bit_count;
bits |= (bd->inbufBits >> bit_count) & ((1 << bits_wanted) - 1);
return bits;
}
//#define get_bits(bd, n) (dbg("%d:get_bits()", __LINE__), get_bits(bd, n))
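/* Illustration only, not part of BusyBox: a minimal sketch of the same
 * most-significant-bit-first reading scheme, working on a fixed in-memory
 * buffer instead of a file descriptor. The names bit_reader and br_get_bits
 * are hypothetical. It assumes at most 24 bits are requested per call, so
 * the 32-bit accumulator never loses unread bits and no overflow path is
 * needed. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone reader; not the BusyBox structure. */
struct bit_reader {
	const uint8_t *buf;  /* input bytes */
	size_t len;          /* number of bytes in buf */
	size_t pos;          /* index of the next unread byte */
	uint32_t bits;       /* accumulator; the low bit_count bits are unread */
	int bit_count;       /* number of unread bits in the accumulator */
};

/* Return the next n bits (1 <= n <= 24), most significant bit first.
 * Returns -1 if the buffer runs out before n bits are available. */
static long br_get_bits(struct bit_reader *br, int n)
{
	assert(n >= 1 && n <= 24);
	while (br->bit_count < n) {
		if (br->pos == br->len)
			return -1;
		/* Append one byte at the low end; older bits move up */
		br->bits = (br->bits << 8) | br->buf[br->pos++];
		br->bit_count += 8;
	}
	br->bit_count -= n;
	return (long)((br->bits >> br->bit_count) & ((1u << n) - 1));
}
```

/* Reading 8, 4 and 12 bits from the bytes 42 5A 68 yields 0x42, 0x5 and
 * 0xA68. Reading byte by byte keeps the result independent of host
 * endianness. */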
/* Unpacks the next block and sets up for the inverse Burrows-Wheeler step. */
static int get_next_block(bunzip_data *bd)
{
int groupCount, selector,
i, j, symCount, symTotal, nSelectors, byteCount[256];
uint8_t uc, symToByte[256], mtfSymbol[256], *selectors;
uint32_t *dbuf;
unsigned origPtr, t;
unsigned dbufCount, runPos;
unsigned runCnt = runCnt; /* for compiler */
dbuf = bd->dbuf;
selectors = bd->selectors;
/* In bbox, we are ok with aborting through setjmp which is set up in start_bunzip */
#if 0
/* Reset longjmp I/O error handling */
i = setjmp(bd->jmpbuf);
if (i) return i;
#endif
/* Read in header signature and CRC, then validate signature.
(last block signature means CRC is for whole file, return now) */
i = get_bits(bd, 24);
j = get_bits(bd, 24);
bd->headerCRC = get_bits(bd, 32);
if ((i == 0x177245) && (j == 0x385090))
return RETVAL_LAST_BLOCK;
if ((i != 0x314159) || (j != 0x265359))
return RETVAL_NOT_BZIP_DATA;
/* We can add support for blockRandomised if anybody complains. There was
some code for this in busybox 1.0.0-pre3, but nobody ever noticed that
it didn't actually work. */
if (get_bits(bd, 1))
return RETVAL_OBSOLETE_INPUT;
origPtr = get_bits(bd, 24);
if (origPtr > bd->dbufSize)
return RETVAL_DATA_ERROR;
/* mapping table: if some byte values are never used (encoding things
like ascii text), the compression code removes the gaps to have fewer
symbols to deal with, and writes a sparse bitfield indicating which
values were present. We make a translation table to convert the symbols
back to the corresponding bytes. */
symTotal = 0;
i = 0;
t = get_bits(bd, 16);
do {
if (t & (1 << 15)) {
unsigned inner_map = get_bits(bd, 16);
do {
if (inner_map & (1 << 15))
symToByte[symTotal++] = i;
inner_map <<= 1;
i++;
} while (i & 15);
i -= 16;
}
t <<= 1;
i += 16;
} while (i < 256);
/* How many different Huffman coding groups does this block use? */
groupCount = get_bits(bd, 3);
if (groupCount < 2 || groupCount > MAX_GROUPS)
return RETVAL_DATA_ERROR;
/* nSelectors: Every GROUP_SIZE many symbols we select a new Huffman coding
group. Read in the group selector list, which is stored as MTF encoded
bit runs. (MTF=Move To Front, as each value is used it's moved to the
start of the list.) */
for (i = 0; i < groupCount; i++)
mtfSymbol[i] = i;
nSelectors = get_bits(bd, 15);
if (!nSelectors)
return RETVAL_DATA_ERROR;
for (i = 0; i < nSelectors; i++) {
uint8_t tmp_byte;
/* Get next value */
int n = 0;
while (get_bits(bd, 1)) {
n++;
if (n >= groupCount)
return RETVAL_DATA_ERROR;
}
/* Decode MTF to get the next selector */
tmp_byte = mtfSymbol[n];
while (--n >= 0)
mtfSymbol[n + 1] = mtfSymbol[n];
//We catch it later, in the second loop where we use selectors[i].
//Maybe this is a better place, though?
// if (tmp_byte >= groupCount) {
// dbg("%d: selectors[%d]:%d groupCount:%d",
// __LINE__, i, tmp_byte, groupCount);
// return RETVAL_DATA_ERROR;
// }
mtfSymbol[0] = selectors[i] = tmp_byte;
}
/* Read the Huffman coding tables for each group, which code for symTotal
literal symbols, plus two run symbols (RUNA, RUNB) */
symCount = symTotal + 2;
for (j = 0; j < groupCount; j++) {
uint8_t length[MAX_SYMBOLS];
/* 8 bits is ALMOST enough for temp[], see below */
unsigned temp[MAX_HUFCODE_BITS+1];
struct group_data *hufGroup;
int *base, *limit;
int minLen, maxLen, pp, len_m1;
/* Read Huffman code lengths for each symbol. They're stored in
a way similar to mtf; record a starting value for the first symbol,
and an offset from the previous value for every symbol after that.
(Subtracting 1 before the loop and then adding it back at the end is
an optimization that makes the test inside the loop simpler: symbol
length 0 becomes negative, so an unsigned inequality catches it.) */
len_m1 = get_bits(bd, 5) - 1;
for (i = 0; i < symCount; i++) {
for (;;) {
int two_bits;
if ((unsigned)len_m1 > (MAX_HUFCODE_BITS-1))
return RETVAL_DATA_ERROR;
/* If first bit is 0, stop. Else second bit indicates whether
to increment or decrement the value. Optimization: grab 2
bits and unget the second if the first was 0. */
two_bits = get_bits(bd, 2);
if (two_bits < 2) {
bd->inbufBitCount++;
break;
}
/* Add one if second bit 1, else subtract 1. Avoids if/else */
len_m1 += (((two_bits+1) & 2) - 1);
}
/* Correct for the initial -1, to get the final symbol length */
length[i] = len_m1 + 1;
}
/* Find largest and smallest lengths in this group */
minLen = maxLen = length[0];
for (i = 1; i < symCount; i++) {
if (length[i] > maxLen)
maxLen = length[i];
else if (length[i] < minLen)
minLen = length[i];
}
/* Calculate permute[], base[], and limit[] tables from length[].
*
* permute[] is the lookup table for converting Huffman coded symbols
* into decoded symbols. base[] is the amount to subtract from the
* value of a Huffman symbol of a given length when using permute[].
*
* limit[] indicates the largest numerical value a symbol with a given
* number of bits can have. This is how the Huffman codes can vary in
* length: each code with a value>limit[length] needs another bit.
*/
hufGroup = bd->groups + j;
hufGroup->minLen = minLen;
hufGroup->maxLen = maxLen;
/* Note that minLen can't be smaller than 1, so we adjust the base
and limit array pointers so we're not always wasting the first
entry. We do this again when using them (during symbol decoding). */
base = hufGroup->base - 1;
limit = hufGroup->limit - 1;
/* Calculate permute[]. Concurrently, initialize temp[] and limit[]. */
pp = 0;
for (i = minLen; i <= maxLen; i++) {
int k;
temp[i] = limit[i] = 0;
for (k = 0; k < symCount; k++)
if (length[k] == i)
hufGroup->permute[pp++] = k;
}
/* Count symbols coded for at each bit length */
/* NB: in pathological cases, temp[8] can end up being 256.
* That's why uint8_t is too small for temp[]. */
for (i = 0; i < symCount; i++)
temp[length[i]]++;
/* Calculate limit[] (the largest symbol-coding value at each bit
* length, which is (previous limit<<1)+symbols at this level), and
* base[] (number of symbols to ignore at each bit length, which is
* limit minus the cumulative count of symbols coded for already). */
pp = t = 0;
for (i = minLen; i < maxLen;) {
unsigned temp_i = temp[i];
pp += temp_i;
/* We read the largest possible symbol size and then unget bits
after determining how many we need, and those extra bits could
be set to anything. (They're noise from future symbols.) At
each level we're really only interested in the first few bits,
so here we set all the trailing to-be-ignored bits to 1 so they
don't affect the value>limit[length] comparison. */
limit[i] = (pp << (maxLen - i)) - 1;
pp <<= 1;
t += temp_i;
base[++i] = pp - t;
}
limit[maxLen] = pp + temp[maxLen] - 1;
limit[maxLen+1] = INT_MAX; /* Sentinel value for reading next sym. */
base[minLen] = 0;
}
/* We've finished reading and digesting the block header. Now read this
block's Huffman coded symbols from the file and undo the Huffman coding
and run length encoding, saving the result into dbuf[dbufCount++] = uc */
/* Initialize symbol occurrence counters and symbol Move To Front table */
/*memset(byteCount, 0, sizeof(byteCount)); - smaller, but slower */
for (i = 0; i < 256; i++) {
byteCount[i] = 0;
mtfSymbol[i] = (uint8_t)i;
}
/* Loop through compressed symbols. */
runPos = dbufCount = selector = 0;
for (;;) {
struct group_data *hufGroup;
int *base, *limit;
int nextSym;
uint8_t ngrp;
/* Fetch next Huffman coding group from list. */
symCount = GROUP_SIZE - 1;
if (selector >= nSelectors)
return RETVAL_DATA_ERROR;
ngrp = selectors[selector++];
if (ngrp >= groupCount) {
dbg("%d selectors[%d]:%d groupCount:%d",
__LINE__, selector-1, ngrp, groupCount);
return RETVAL_DATA_ERROR;
}
hufGroup = bd->groups + ngrp;
base = hufGroup->base - 1;
limit = hufGroup->limit - 1;
continue_this_group:
/* Read next Huffman-coded symbol. */
/* Note: It is far cheaper to read maxLen bits and back up than it is
to read minLen bits and then add one additional bit at a time, testing
as we go. Because there is a trailing last block (with file CRC),
there is no danger of the overread causing an unexpected EOF for a
valid compressed file.
*/
if (1) {
/* As a further optimization, we do the read inline
(falling back to a call to get_bits if the buffer runs dry).
*/
int new_cnt;
while ((new_cnt = bd->inbufBitCount - hufGroup->maxLen) < 0) {
/* bd->inbufBitCount < hufGroup->maxLen */
if (bd->inbufPos == bd->inbufCount) {
nextSym = get_bits(bd, hufGroup->maxLen);
goto got_huff_bits;
}
bd->inbufBits = (bd->inbufBits << 8) | bd->inbuf[bd->inbufPos++];
bd->inbufBitCount += 8;
}
bd->inbufBitCount = new_cnt; /* "bd->inbufBitCount -= hufGroup->maxLen;" */
nextSym = (bd->inbufBits >> new_cnt) & ((1 << hufGroup->maxLen) - 1);
got_huff_bits: ;
} else { /* unoptimized equivalent */
nextSym = get_bits(bd, hufGroup->maxLen);
}
/* Figure how many bits are in next symbol and unget extras */
i = hufGroup->minLen;
while (nextSym > limit[i])
++i;
j = hufGroup->maxLen - i;
if (j < 0)
return RETVAL_DATA_ERROR;
bd->inbufBitCount += j;
/* Huffman decode value to get nextSym (with bounds checking) */
nextSym = (nextSym >> j) - base[i];
if ((unsigned)nextSym >= MAX_SYMBOLS)
return RETVAL_DATA_ERROR;
nextSym = hufGroup->permute[nextSym];
/* We have now decoded the symbol, which indicates either a new literal
byte, or a repeated run of the most recent literal byte. First,
check if nextSym indicates a repeated run, and if so loop collecting
how many times to repeat the last literal. */
if ((unsigned)nextSym <= SYMBOL_RUNB) { /* RUNA or RUNB */
/* If this is the start of a new run, zero out counter */
if (runPos == 0) {
runPos = 1;
runCnt = 0;
}
/* Neat trick that saves 1 symbol: instead of or-ing 0 or 1 at
each bit position, add 1 or 2 instead. For example,
1011 is 1<<0 + 1<<1 + 2<<2. 1010 is 2<<0 + 2<<1 + 1<<2.
You can make any bit pattern that way using 1 less symbol than
the basic or 0/1 method (except all bits 0, which would use no
symbols, but a run of length 0 doesn't mean anything in this
context). Thus space is saved. */
runCnt += (runPos << nextSym); /* +runPos if RUNA; +2*runPos if RUNB */
//The 32-bit overflow of runCnt wasn't yet seen, but probably can happen.
//This would be the fix (catches too large count way before it can overflow):
// if (runCnt > bd->dbufSize) {
// dbg("runCnt:%u > dbufSize:%u RETVAL_DATA_ERROR",
// runCnt, bd->dbufSize);
// return RETVAL_DATA_ERROR;
// }
if (runPos < bd->dbufSize) runPos <<= 1;
goto end_of_huffman_loop;
}
/* When we hit the first non-run symbol after a run, we now know
how many times to repeat the last literal, so append that many
copies to our buffer of decoded symbols (dbuf) now. (The last
literal used is the one at the head of the mtfSymbol array.) */
if (runPos != 0) {
uint8_t tmp_byte;
if (dbufCount + runCnt > bd->dbufSize) {
dbg("dbufCount:%u+runCnt:%u %u > dbufSize:%u RETVAL_DATA_ERROR",
dbufCount, runCnt, dbufCount + runCnt, bd->dbufSize);
return RETVAL_DATA_ERROR;
}
tmp_byte = symToByte[mtfSymbol[0]];
byteCount[tmp_byte] += runCnt;
while ((int)--runCnt >= 0)
dbuf[dbufCount++] = (uint32_t)tmp_byte;
runPos = 0;
}
/* Is this the terminating symbol? */
if (nextSym > symTotal) break;
/* At this point, nextSym indicates a new literal character. Subtract
one to get the position in the MTF array at which this literal is
currently to be found. (Note that the result can't be -1 or 0,
because 0 and 1 are RUNA and RUNB. But another instance of the
first symbol in the mtf array, position 0, would have been handled
as part of a run above. Therefore 1 unused mtf position minus
2 non-literal nextSym values equals -1.) */
if (dbufCount >= bd->dbufSize) return RETVAL_DATA_ERROR;
i = nextSym - 1;
uc = mtfSymbol[i];
/* Adjust the MTF array. Since we typically expect to move only a
* small number of symbols, and are bound by 256 in any case, using
* memmove here would typically be bigger and slower due to function
* call overhead and other assorted setup costs. */
do {
mtfSymbol[i] = mtfSymbol[i-1];
} while (--i);
mtfSymbol[0] = uc;
uc = symToByte[uc];
/* We have our literal byte. Save it into dbuf. */
byteCount[uc]++;
dbuf[dbufCount++] = (uint32_t)uc;
/* Skip group initialization if we're not done with this group. Done
* this way to avoid compiler warning. */
end_of_huffman_loop:
if (--symCount >= 0) goto continue_this_group;
}
/* At this point, we've read all the Huffman-coded symbols (and repeated
runs) for this block from the input stream, and decoded them into the
intermediate buffer. There are dbufCount many decoded bytes in dbuf[].
Now undo the Burrows-Wheeler transform on dbuf.
See http://dogma.net/markn/articles/bwt/bwt.htm
*/
/* Turn byteCount into cumulative occurrence counts of 0 to n-1. */
j = 0;
for (i = 0; i < 256; i++) {
int tmp_count = j + byteCount[i];
byteCount[i] = j;
j = tmp_count;
}
/* Figure out what order dbuf would be in if we sorted it. */
for (i = 0; i < dbufCount; i++) {
uint8_t tmp_byte = (uint8_t)dbuf[i];
int tmp_count = byteCount[tmp_byte];
dbuf[tmp_count] |= (i << 8);
byteCount[tmp_byte] = tmp_count + 1;
}
/* Decode first byte by hand to initialize "previous" byte. Note that it
doesn't get output, and if the first three characters are identical
it doesn't qualify as a run (hence writeRunCountdown=5). */
if (dbufCount) {
uint32_t tmp;
if ((int)origPtr >= dbufCount) return RETVAL_DATA_ERROR;
tmp = dbuf[origPtr];
bd->writeCurrent = (uint8_t)tmp;
bd->writePos = (tmp >> 8);
bd->writeRunCountdown = 5;
}
bd->writeCount = dbufCount;
return RETVAL_OK;
}
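/* Illustration only, not part of BusyBox: a minimal sketch of the RUNA/RUNB
 * run-length encoding described in get_next_block(). The function name
 * decode_run_length is hypothetical. Symbols are given least significant
 * first, 0 for RUNA and 1 for RUNB, and the sketch assumes the run is short
 * enough (fewer than 31 symbols) that the counter cannot overflow. */

```c
#include <assert.h>

/* Decode a sequence of RUNA (0) / RUNB (1) symbols into a run length.
 * Symbol k contributes (1 << k) for RUNA and (2 << k) for RUNB, which is
 * the same arithmetic as "runCnt += (runPos << nextSym)" with runPos
 * doubling after every symbol. */
static unsigned decode_run_length(const int *syms, int n)
{
	unsigned run_pos = 1;
	unsigned run_cnt = 0;
	int k;

	assert(n >= 0 && n < 31);
	for (k = 0; k < n; k++) {
		assert(syms[k] == 0 || syms[k] == 1);
		run_cnt += run_pos << syms[k];
		run_pos <<= 1;
	}
	return run_cnt;
}
```

/* With this scheme RUNA,RUNA,RUNB decodes to 1 + 2 + 8 = 11 and
 * RUNB,RUNB,RUNA decodes to 2 + 4 + 4 = 10, matching the two worked
 * patterns in the comment inside get_next_block(). */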
/* Undo Burrows-Wheeler transform on intermediate buffer to produce output.
If start_bunzip was initialized with out_fd=-1, then up to len bytes of
data are written to outbuf. Return value is number of bytes written or
error (all errors are negative numbers). If out_fd!=-1, outbuf and len
are ignored, data is written to out_fd and return is RETVAL_OK or error.
NB: read_bunzip returns < 0 on error, or the number of *unfilled* bytes
in outbuf. IOW: on EOF returns len ("all bytes are not filled"), not 0.
(Why? This allows getting rid of one local variable.)
*/
static int read_bunzip(bunzip_data *bd, char *outbuf, int len)
{
const uint32_t *dbuf;
int pos, current, previous;
uint32_t CRC;
/* If we already have error/end indicator, return it */
if (bd->writeCount < 0)
return bd->writeCount;
dbuf = bd->dbuf;
/* Register-cached state (hopefully): */
pos = bd->writePos;
current = bd->writeCurrent;
CRC = bd->writeCRC; /* small loss on x86-32 (not enough regs), win on x86-64 */
/* We will always have pending decoded data to write into the output
buffer unless this is the very first call (in which case we haven't
Huffman-decoded a block into the intermediate buffer yet). */
if (bd->writeCopies) {
dec_writeCopies:
/* Inside the loop, writeCopies means extra copies (beyond 1) */
--bd->writeCopies;
/* Loop outputting bytes */
for (;;) {
/* If the output buffer is full, save cached state and return */
if (--len < 0) {
/* Unlikely branch.
* Use of "goto" instead of keeping code here
* helps compiler to realize this. */
goto outbuf_full;
}
/* Write next byte into output buffer, updating CRC */
*outbuf++ = current;
CRC = (CRC << 8) ^ bd->crc32Table[(CRC >> 24) ^ current];
/* Loop now if we're outputting multiple copies of this byte */
if (bd->writeCopies) {
/* Unlikely branch */
/*--bd->writeCopies;*/
/*continue;*/
/* Same, but (ab)using other existing --writeCopies operation
* (and this if() compiles into just test+branch pair): */
goto dec_writeCopies;
}
decode_next_byte:
if (--bd->writeCount < 0)
break; /* input block is fully consumed, need next one */
/* Follow sequence vector to undo Burrows-Wheeler transform */
previous = current;
pos = dbuf[pos];
current = (uint8_t)pos;
pos >>= 8;
/* After 4 consecutive copies of the same byte, the 5th byte
* is a repeat count. We count down from 4 instead
* of counting up because testing for non-zero is faster */
if (--bd->writeRunCountdown != 0) {
if (current != previous)
bd->writeRunCountdown = 4;
} else {
/* Unlikely branch */
/* We have a repeated run, this byte indicates the count */
bd->writeCopies = current;
current = previous;
bd->writeRunCountdown = 5;
/* Sometimes there are just 4 bytes (repeat count 0) */
if (!bd->writeCopies) goto decode_next_byte;
/* Subtract the 1 copy we'd output anyway to get extras */
--bd->writeCopies;
}
} /* for (;;) */
/* Decompression of this input block completed successfully */
bd->writeCRC = CRC = ~CRC;
bd->totalCRC = ((bd->totalCRC << 1) | (bd->totalCRC >> 31)) ^ CRC;
/* If this block had a CRC error, force file level CRC error */
if (CRC != bd->headerCRC) {
bd->totalCRC = bd->headerCRC + 1;
return RETVAL_LAST_BLOCK;
}
}
/* Refill the intermediate buffer by Huffman-decoding next block of input */
{
int r = get_next_block(bd);
if (r) { /* error/end */
bd->writeCount = r;
return (r != RETVAL_LAST_BLOCK) ? r : len;
}
}
CRC = ~0;
pos = bd->writePos;
current = bd->writeCurrent;
goto decode_next_byte;
outbuf_full:
/* Output buffer is full, save cached state and return */
bd->writePos = pos;
bd->writeCurrent = current;
bd->writeCRC = CRC;
bd->writeCopies++;
return 0;
}
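/* Illustration only, not part of BusyBox: a minimal sketch of undoing the
 * Burrows-Wheeler transform with the same packed layout used above (byte in
 * the low 8 bits of each word, link to the next position in the upper 24
 * bits). The function name inverse_bwt and its parameters are hypothetical,
 * and the whole block is decoded in one call instead of being resumable. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* last:     last column of the sorted rotation matrix (n bytes, n < 2^24)
 * orig_ptr: row of the original string in that matrix
 * vec:      scratch array of n words
 * out:      receives the n original bytes */
static void inverse_bwt(const uint8_t *last, size_t n, size_t orig_ptr,
		uint32_t *vec, uint8_t *out)
{
	size_t count[256] = { 0 };
	size_t i, sum = 0;
	uint32_t pos;

	if (n == 0)
		return;
	assert(n < ((size_t)1 << 24) && orig_ptr < n);

	for (i = 0; i < n; i++) {
		vec[i] = last[i];
		count[last[i]]++;
	}
	/* Turn counts into the first sorted slot of each byte value */
	for (i = 0; i < 256; i++) {
		size_t c = count[i];
		count[i] = sum;
		sum += c;
	}
	/* Sorted slot j links to the index i it came from in the last column */
	for (i = 0; i < n; i++) {
		uint8_t b = (uint8_t)vec[i];
		vec[count[b]++] |= (uint32_t)i << 8;
	}
	/* Follow the links, starting from the row of the original string */
	pos = vec[orig_ptr] >> 8;
	for (i = 0; i < n; i++) {
		uint32_t entry = vec[pos];
		out[i] = (uint8_t)entry;
		pos = entry >> 8;
	}
}
```

/* For the last column "nnbaaa" with orig_ptr 3, following the links
 * reproduces "banana". As in get_next_block(), the byte stored at
 * vec[orig_ptr] itself is only used for its link and is not output. */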
/* Allocate the structure, read file header. If in_fd==-1, inbuf must contain
a complete bunzip file (len bytes long). If in_fd!=-1, the len bytes at inbuf
(input the caller has already read) are copied into a temporary buffer, and
further data is read from the file handle into that buffer. */
/* Because bunzip2 is used for help text unpacking, and because bb_show_usage()
should work for NOFORK applets too, we must be extremely careful to not leak
any allocations! */
static int FAST_FUNC start_bunzip(
void *jmpbuf,
bunzip_data **bdp,
int in_fd,
const void *inbuf, int len)
{
bunzip_data *bd;
unsigned i;
enum {
BZh0 = ('B' << 24) + ('Z' << 16) + ('h' << 8) + '0',
h0 = ('h' << 8) + '0',
};
/* Figure out how much data to allocate */
i = sizeof(bunzip_data);
if (in_fd != -1)
i += IOBUF_SIZE;
/* Allocate bunzip_data. Most fields initialize to zero. */
bd = *bdp = xzalloc(i);
bd->jmpbuf = jmpbuf;
/* Setup input buffer */
bd->in_fd = in_fd;
if (-1 == in_fd) {
/* in this case, bd->inbuf is read-only */
bd->inbuf = (void*)inbuf; /* cast away const-ness */
} else {
bd->inbuf = (uint8_t*)(bd + 1);
memcpy(bd->inbuf, inbuf, len);
}
bd->inbufCount = len;
/* Init the CRC32 table (big endian) */
crc32_filltable(bd->crc32Table, 1);
/* Ensure that file starts with "BZh['1'-'9']." */
/* Update: now caller verifies 1st two bytes, makes .gz/.bz2
* integration easier */
/* was: */
/* i = get_bits(bd, 32); */
/* if ((unsigned)(i - BZh0 - 1) >= 9) return RETVAL_NOT_BZIP_DATA; */
i = get_bits(bd, 16);
if ((unsigned)(i - h0 - 1) >= 9) return RETVAL_NOT_BZIP_DATA;
/* Fourth byte (ascii '1'-'9') indicates block size in units of 100k of
uncompressed data. Allocate intermediate buffer for block. */
/* bd->dbufSize = 100000 * (i - BZh0); */
bd->dbufSize = 100000 * (i - h0);
/* Cannot use xmalloc - may leak bd in NOFORK case! */
bd->dbuf = malloc_or_warn(bd->dbufSize * sizeof(bd->dbuf[0]));
if (!bd->dbuf) {
free(bd);
xfunc_die();
}
return RETVAL_OK;
}
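/* Illustration only, not part of BusyBox: a minimal sketch of a big endian
 * (MSB-first) CRC-32 whose update step has the same shape as
 * "CRC = (CRC << 8) ^ bd->crc32Table[(CRC >> 24) ^ current]" in
 * read_bunzip(). The names crc_table, crc32_be_init and crc32_be are
 * hypothetical; the sketch assumes the polynomial 0x04C11DB7 used by the
 * bzip2 format and makes no claim about how crc32_filltable() is written. */

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Build the MSB-first lookup table for polynomial 0x04C11DB7 */
static void crc32_be_init(void)
{
	uint32_t i;
	int bit;

	for (i = 0; i < 256; i++) {
		uint32_t c = i << 24;
		for (bit = 0; bit < 8; bit++)
			c = (c & 0x80000000u) ? (c << 1) ^ 0x04C11DB7u : (c << 1);
		crc_table[i] = c;
	}
}

/* CRC of a whole buffer: start from all ones, invert at the end */
static uint32_t crc32_be(const uint8_t *data, size_t len)
{
	uint32_t crc = ~(uint32_t)0;
	size_t i;

	for (i = 0; i < len; i++)
		crc = (crc << 8) ^ crc_table[(crc >> 24) ^ data[i]];
	return ~crc;
}
```

/* crc32_be_init() must run once before crc32_be() is called. The final
 * inversion corresponds to "CRC = ~CRC" at the end of each block in
 * read_bunzip(), and the all-ones start value to "CRC = ~0". */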
static void FAST_FUNC dealloc_bunzip(bunzip_data *bd)
{
free(bd->dbuf);
free(bd);
}
/* Decompress src_fd to dst_fd. Stops at end of bzip data, not end of file. */
IF_DESKTOP(long long) int FAST_FUNC
unpack_bz2_stream(transformer_state_t *xstate)
{
IF_DESKTOP(long long total_written = 0;)
bunzip_data *bd;
char *outbuf;
int i;
unsigned len;
if (check_signature16(xstate, BZIP2_MAGIC))
return -1;
outbuf = xmalloc(IOBUF_SIZE);
len = 0;
while (1) { /* "Process one BZ... stream" loop */
jmp_buf jmpbuf;
/* Setup for I/O error handling via longjmp */
i = setjmp(jmpbuf);
if (i == 0)
i = start_bunzip(&jmpbuf, &bd, xstate->src_fd, outbuf + 2, len);
if (i == 0) {
while (1) { /* "Produce some output bytes" loop */
i = read_bunzip(bd, outbuf, IOBUF_SIZE);
if (i < 0) /* error? */
break;
i = IOBUF_SIZE - i; /* number of bytes produced */
if (i == 0) /* EOF? */
break;
if (i != transformer_write(xstate, outbuf, i)) {
i = RETVAL_SHORT_WRITE;
goto release_mem;
}
IF_DESKTOP(total_written += i;)
}
}
if (i != RETVAL_LAST_BLOCK
/* Observed case when i == RETVAL_OK:
* "bzcat z.bz2", where "z.bz2" is a bzipped zero-length file
* (to be exact, z.bz2 is exactly these 14 bytes:
* 42 5a 68 39 17 72 45 38 50 90 00 00 00 00).
*/
&& i != RETVAL_OK
) {
bb_error_msg("bunzip error %d", i);
break;
}
if (bd->headerCRC != bd->totalCRC) {
bb_simple_error_msg("CRC error");
break;
}
/* Successfully unpacked one BZ stream */
i = RETVAL_OK;
/* Do we have "BZ..." after last processed byte?
* pbzip2 (parallelized bzip2) produces such files.
*/
len = bd->inbufCount - bd->inbufPos;
memcpy(outbuf, &bd->inbuf[bd->inbufPos], len);
if (len < 2) {
if (safe_read(xstate->src_fd, outbuf + len, 2 - len) != 2 - len)
break;
len = 2;
}
if (*(uint16_t*)outbuf != BZIP2_MAGIC) /* "BZ"? */
break;
dealloc_bunzip(bd);
len -= 2;
}
release_mem:
dealloc_bunzip(bd);
free(outbuf);
return i ? i : IF_DESKTOP(total_written) + 0;
}
char* FAST_FUNC
unpack_bz2_data(const char *packed, int packed_len, int unpacked_len)
{
char *outbuf = NULL;
bunzip_data *bd;
int i;
jmp_buf jmpbuf;
/* Setup for I/O error handling via longjmp */
i = setjmp(jmpbuf);
if (i == 0) {
i = start_bunzip(&jmpbuf,
&bd,
/* src_fd: */ -1,
/* inbuf: */ packed,
/* len: */ packed_len
);
}
/* read_bunzip can longjmp and end up here with i != 0
* on read data errors! Not trivial */
if (i == 0) {
/* Cannot use xmalloc: will leak bd in NOFORK case! */
outbuf = malloc_or_warn(unpacked_len);
if (outbuf)
read_bunzip(bd, outbuf, unpacked_len);
}
dealloc_bunzip(bd);
return outbuf;
}
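/* Illustration only, not part of BusyBox: a minimal sketch of the
 * setjmp/longjmp error handling pattern used by unpack_bz2_stream() and
 * unpack_bz2_data(), where a low-level read helper jumps straight back to
 * the top-level caller on input EOF. All names here (reader, next_byte,
 * sum_bytes, ERR_EOF) are hypothetical. */

```c
#include <setjmp.h>

#define ERR_EOF (-3)

/* Hypothetical reader state; jmp points at the caller's jmp_buf */
struct reader {
	const unsigned char *buf;
	int len, pos;
	jmp_buf *jmp;
};

/* Low-level helper: never returns an error code, it jumps instead */
static int next_byte(struct reader *r)
{
	if (r->pos == r->len)
		longjmp(*r->jmp, ERR_EOF);  /* unwinds to the setjmp() call */
	return r->buf[r->pos++];
}

/* Sum exactly n bytes; returns 0 on success or ERR_EOF if input ran out.
 * *sum is written only on success. */
static int sum_bytes(const unsigned char *buf, int len, int n, int *sum)
{
	jmp_buf jb;
	struct reader r = { buf, len, 0, &jb };
	int total = 0;
	int i;

	if (setjmp(jb) != 0)
		return ERR_EOF;  /* arrived here via longjmp; the only error raised */
	for (i = 0; i < n; i++)
		total += next_byte(&r);
	*sum = total;
	return 0;
}
```

/* The benefit is that the inner loops need no per-call error checks. The
 * cost is that anything allocated before the jump must be released by the
 * code at the setjmp() site, which is why the functions above free bd on
 * every exit path. */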
#ifdef TESTING
static char *const bunzip_errors[] = {
NULL, "Bad file checksum", "Not bzip data",
"Unexpected input EOF", "Unexpected output EOF", "Data error",
"Out of memory", "Obsolete (pre 0.9.5) bzip format not supported"
};
/* Dumb little test thing, decompress stdin to stdout */
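int main(int argc, char **argv)
{
char c;
int i;
transformer_state_t xstate;
/* unpack_bz2_stream() takes a transformer state, not a pair of fds */
init_transformer_state(&xstate);
xstate.src_fd = STDIN_FILENO;
xstate.dst_fd = STDOUT_FILENO;
i = unpack_bz2_stream(&xstate);
if (i < 0)
fprintf(stderr, "%s\n", bunzip_errors[-i]);
else if (read(STDIN_FILENO, &c, 1))
fprintf(stderr, "Trailing garbage ignored\n");
return i < 0 ? -i : 0;
}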
#endif

File diff suppressed because it is too large

@ -0,0 +1,312 @@
/* vi: set sw=4 ts=4: */
/*
* uncompress for busybox -- (c) 2002 Robert Griebl
*
* based on the original compress42.c source
* (see disclaimer below)
*/
/* (N)compress42.c - File compression ala IEEE Computer, Mar 1992.
*
* Authors:
* Spencer W. Thomas (decvax!harpo!utah-cs!utah-gr!thomas)
* Jim McKie (decvax!mcvax!jim)
* Steve Davies (decvax!vax135!petsd!peora!srd)
* Ken Turkowski (decvax!decwrl!turtlevax!ken)
* James A. Woods (decvax!ihnp4!ames!jaw)
* Joe Orost (decvax!vax135!petsd!joe)
* Dave Mack (csu@alembic.acs.com)
* Peter Jannesen, Network Communication Systems
* (peter@ncs.nl)
*
* marc@suse.de : a small security fix for a buffer overflow
*
* [... History snipped ...]
*/
#include "libbb.h"
#include "bb_archive.h"
/* Default input buffer size */
#define IBUFSIZ 2048
/* Default output buffer size */
#define OBUFSIZ 2048
/* Defines for third byte of header */
#define BIT_MASK 0x1f /* Mask for 'number of compression bits' */
/* Masks 0x20 and 0x40 are free. */
/* I think 0x20 should mean that there is */
/* a fourth header byte (for expansion). */
#define BLOCK_MODE 0x80 /* Block compression: if table is full and */
/* compression rate is dropping, flush tables */
/* the next two codes should not be changed lightly, as they must not */
/* lie within the contiguous general code space. */
#define FIRST 257 /* first free entry */
#define CLEAR 256 /* table clear output code */
#define INIT_BITS 9 /* initial number of bits/code */
/* machine variants which require cc -Dmachine: pdp11, z8000, DOS */
#define HBITS 17 /* 50% occupancy */
#define HSIZE (1<<HBITS)
#define HMASK (HSIZE-1) /* unused */
#define HPRIME 9941 /* unused */
#define BITS 16
#define BITS_STR "16"
#undef MAXSEG_64K /* unused */
#define MAXCODE(n) (1L << (n))
#define htabof(i) htab[i]
#define codetabof(i) codetab[i]
#define tab_prefixof(i) codetabof(i)
#define tab_suffixof(i) ((unsigned char *)(htab))[i]
#define de_stack ((unsigned char *)&(htab[HSIZE-1]))
#define clear_tab_prefixof() memset(codetab, 0, 256)
/*
* Decompress stdin to stdout. This routine adapts to the codes in the
* file building the "string" table on-the-fly; requiring no table to
* be stored in the compressed file.
*/
IF_DESKTOP(long long) int FAST_FUNC
unpack_Z_stream(transformer_state_t *xstate)
{
IF_DESKTOP(long long total_written = 0;)
IF_DESKTOP(long long) int retval = -1;
unsigned char *stackp;
int finchar;
long oldcode;
long incode;
int inbits;
int posbits;
int outpos;
int insize;
int bitmask;
long free_ent;
long maxcode;
long maxmaxcode;
int n_bits;
int rsize = 0;
unsigned char *inbuf; /* were eating insane amounts of stack - */
unsigned char *outbuf; /* bad for some embedded targets */
unsigned char *htab;
unsigned short *codetab;
/* Hmm, these were statics - why?! */
/* user settable max # bits/code */
int maxbits; /* = BITS; */
/* block compress mode -C compatible with 2.0 */
int block_mode; /* = BLOCK_MODE; */
if (check_signature16(xstate, COMPRESS_MAGIC))
return -1;
inbuf = xzalloc(IBUFSIZ + 64);
outbuf = xzalloc(OBUFSIZ + 2048);
htab = xzalloc(HSIZE); /* wasn't zeroed out before, maybe can xmalloc? */
codetab = xzalloc(HSIZE * sizeof(codetab[0]));
insize = 0;
/* xread isn't good here, we have to return - caller may want
* to do some cleanup (e.g. delete incomplete unpacked file etc) */
if (full_read(xstate->src_fd, inbuf, 1) != 1) {
bb_simple_error_msg("short read");
goto err;
}
maxbits = inbuf[0] & BIT_MASK;
block_mode = inbuf[0] & BLOCK_MODE;
maxmaxcode = MAXCODE(maxbits);
if (maxbits > BITS) {
bb_error_msg("compressed with %d bits, can only handle "
BITS_STR" bits", maxbits);
goto err;
}
n_bits = INIT_BITS;
maxcode = MAXCODE(INIT_BITS) - 1;
bitmask = (1 << INIT_BITS) - 1;
oldcode = -1;
finchar = 0;
outpos = 0;
posbits = 0 << 3;
free_ent = ((block_mode) ? FIRST : 256);
/* As above, initialize the first 256 entries in the table. */
/*clear_tab_prefixof(); - done by xzalloc */
{
int i;
for (i = 255; i >= 0; --i)
tab_suffixof(i) = (unsigned char) i;
}
do {
resetbuf:
{
int i;
int e;
int o;
o = posbits >> 3;
e = insize - o;
for (i = 0; i < e; ++i)
inbuf[i] = inbuf[i + o];
insize = e;
posbits = 0;
}
if (insize < (int) (IBUFSIZ + 64) - IBUFSIZ) {
rsize = safe_read(xstate->src_fd, inbuf + insize, IBUFSIZ);
if (rsize < 0)
bb_simple_error_msg_and_die(bb_msg_read_error);
insize += rsize;
}
inbits = ((rsize > 0) ? (insize - insize % n_bits) << 3 :
(insize << 3) - (n_bits - 1));
while (inbits > posbits) {
long code;
if (free_ent > maxcode) {
posbits =
((posbits - 1) +
((n_bits << 3) -
(posbits - 1 + (n_bits << 3)) % (n_bits << 3)));
++n_bits;
if (n_bits == maxbits) {
maxcode = maxmaxcode;
} else {
maxcode = MAXCODE(n_bits) - 1;
}
bitmask = (1 << n_bits) - 1;
goto resetbuf;
}
{
unsigned char *p = &inbuf[posbits >> 3];
code = ((p[0]
| ((long) (p[1]) << 8)
| ((long) (p[2]) << 16)) >> (posbits & 0x7)) & bitmask;
}
posbits += n_bits;
if (oldcode == -1) {
if (code >= 256)
bb_simple_error_msg_and_die("corrupted data"); /* %ld", code); */
oldcode = code;
finchar = (int) oldcode;
outbuf[outpos++] = (unsigned char) finchar;
continue;
}
if (code == CLEAR && block_mode) {
clear_tab_prefixof();
free_ent = FIRST - 1;
posbits =
((posbits - 1) +
((n_bits << 3) -
(posbits - 1 + (n_bits << 3)) % (n_bits << 3)));
n_bits = INIT_BITS;
maxcode = MAXCODE(INIT_BITS) - 1;
bitmask = (1 << INIT_BITS) - 1;
goto resetbuf;
}
incode = code;
stackp = de_stack;
/* Special case for KwKwK string. */
if (code >= free_ent) {
if (code > free_ent) {
/*
unsigned char *p;
posbits -= n_bits;
p = &inbuf[posbits >> 3];
bb_error_msg
("insize:%d posbits:%d inbuf:%02X %02X %02X %02X %02X (%d)",
insize, posbits, p[-1], p[0], p[1], p[2], p[3],
(posbits & 07));
*/
bb_simple_error_msg("corrupted data");
goto err;
}
*--stackp = (unsigned char) finchar;
code = oldcode;
}
/* Generate output characters in reverse order */
while (code >= 256) {
if (stackp <= &htabof(0))
bb_simple_error_msg_and_die("corrupted data");
*--stackp = tab_suffixof(code);
code = tab_prefixof(code);
}
finchar = tab_suffixof(code);
*--stackp = (unsigned char) finchar;
/* And put them out in forward order */
{
int i;
i = de_stack - stackp;
if (outpos + i >= OBUFSIZ) {
do {
if (i > OBUFSIZ - outpos) {
i = OBUFSIZ - outpos;
}
if (i > 0) {
memcpy(outbuf + outpos, stackp, i);
outpos += i;
}
if (outpos >= OBUFSIZ) {
xtransformer_write(xstate, outbuf, outpos);
IF_DESKTOP(total_written += outpos;)
outpos = 0;
}
stackp += i;
i = de_stack - stackp;
} while (i > 0);
} else {
memcpy(outbuf + outpos, stackp, i);
outpos += i;
}
}
/* Generate the new entry. */
if (free_ent < maxmaxcode) {
tab_prefixof(free_ent) = (unsigned short) oldcode;
tab_suffixof(free_ent) = (unsigned char) finchar;
free_ent++;
}
/* Remember previous code. */
oldcode = incode;
}
} while (rsize > 0);
if (outpos > 0) {
xtransformer_write(xstate, outbuf, outpos);
IF_DESKTOP(total_written += outpos;)
}
retval = IF_DESKTOP(total_written) + 0;
err:
free(inbuf);
free(outbuf);
free(htab);
free(codetab);
return retval;
}
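
The decoder above pulls each variable-width code out of `inbuf` by loading three bytes at the byte containing bit offset `posbits`, shifting off the bits already consumed, and masking to `n_bits`. A minimal sketch of that extraction step in isolation follows; `read_code` and `read_code_demo` are hypothetical names used only for illustration, and the sketch assumes codes are at most 16 bits wide so that three bytes always cover one code.

```c
#include <assert.h>

/* Hypothetical helper, not part of BusyBox: return the n_bits-wide code
 * that starts at bit offset posbits in buf, least significant bit first.
 * It always loads 3 bytes, so buf needs 2 readable bytes after the byte
 * in which the code starts. Assumes 1 <= n_bits <= 16. */
static long read_code(const unsigned char *buf, unsigned posbits, int n_bits)
{
	const unsigned char *p = &buf[posbits >> 3];
	long bitmask = (1L << n_bits) - 1;
	long window = p[0] | ((long)p[1] << 8) | ((long)p[2] << 16);

	return (window >> (posbits & 7)) & bitmask;
}

/* Two 9-bit codes, 65 and 66, packed back to back. */
void read_code_demo(void)
{
	static const unsigned char buf[4] = { 0x41, 0x84, 0x00, 0x00 };

	assert(read_code(buf, 0, 9) == 65);
	assert(read_code(buf, 9, 9) == 66);
}
```

The caller advances the bit offset by `n_bits` after each code, which is what `posbits += n_bits` does in the loop above.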


@ -0,0 +1,527 @@
/* vi: set sw=4 ts=4: */
/*
* Small LZMA decompressor implementation.
* Copyright (C) 2006 Aurelien Jacobs <aurel@gnuage.org>
*
* Based on LzmaDecode.c from the LZMA SDK 4.22 (http://www.7-zip.org/)
* Copyright (C) 1999-2005 Igor Pavlov
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
#if 0
# define dbg(...) bb_error_msg(__VA_ARGS__)
#else
# define dbg(...) ((void)0)
#endif
#if ENABLE_FEATURE_LZMA_FAST
# define speed_inline ALWAYS_INLINE
# define size_inline
#else
# define speed_inline
# define size_inline ALWAYS_INLINE
#endif
typedef struct {
int fd;
uint8_t *ptr;
/* Was keeping rc on stack in unlzma and separately allocating buffer,
* but with "buffer 'attached to' allocated rc" code is smaller: */
/* uint8_t *buffer; */
#define RC_BUFFER ((uint8_t*)(rc+1))
uint8_t *buffer_end;
/* Had provisions for variable buffer, but we don't need it here */
/* int buffer_size; */
#define RC_BUFFER_SIZE 0x10000
uint32_t code;
uint32_t range;
uint32_t bound;
} rc_t;
#define RC_TOP_BITS 24
#define RC_MOVE_BITS 5
#define RC_MODEL_TOTAL_BITS 11
/* Called once in rc_do_normalize() */
static void rc_read(rc_t *rc)
{
int buffer_size = safe_read(rc->fd, RC_BUFFER, RC_BUFFER_SIZE);
//TODO: return -1 instead
//This will make unlzma delete broken unpacked file on unpack errors
if (buffer_size <= 0)
bb_simple_error_msg_and_die("unexpected EOF");
rc->buffer_end = RC_BUFFER + buffer_size;
rc->ptr = RC_BUFFER;
}
/* Called twice, but one callsite is in speed_inline'd rc_is_bit_1() */
static void rc_do_normalize(rc_t *rc)
{
if (rc->ptr >= rc->buffer_end)
rc_read(rc);
rc->range <<= 8;
rc->code = (rc->code << 8) | *rc->ptr++;
}
static ALWAYS_INLINE void rc_normalize(rc_t *rc)
{
if (rc->range < (1 << RC_TOP_BITS)) {
rc_do_normalize(rc);
}
}
/* Called once */
static ALWAYS_INLINE rc_t* rc_init(int fd) /*, int buffer_size) */
{
int i;
rc_t *rc;
rc = xzalloc(sizeof(*rc) + RC_BUFFER_SIZE);
rc->fd = fd;
/* rc->ptr = rc->buffer_end; */
for (i = 0; i < 5; i++) {
rc_do_normalize(rc);
}
rc->range = 0xffffffff;
return rc;
}
/* Called once */
static ALWAYS_INLINE void rc_free(rc_t *rc)
{
free(rc);
}
/* rc_is_bit_1 is called 9 times */
static speed_inline int rc_is_bit_1(rc_t *rc, uint16_t *p)
{
rc_normalize(rc);
rc->bound = *p * (rc->range >> RC_MODEL_TOTAL_BITS);
if (rc->code < rc->bound) {
rc->range = rc->bound;
*p += ((1 << RC_MODEL_TOTAL_BITS) - *p) >> RC_MOVE_BITS;
return 0;
}
rc->range -= rc->bound;
rc->code -= rc->bound;
*p -= *p >> RC_MOVE_BITS;
return 1;
}
/* Called 4 times in unlzma loop */
static ALWAYS_INLINE int rc_get_bit(rc_t *rc, uint16_t *p, int *symbol)
{
int ret = rc_is_bit_1(rc, p);
*symbol = *symbol * 2 + ret;
return ret;
}
/* Called once */
static ALWAYS_INLINE int rc_direct_bit(rc_t *rc)
{
rc_normalize(rc);
rc->range >>= 1;
if (rc->code >= rc->range) {
rc->code -= rc->range;
return 1;
}
return 0;
}
/* Called twice */
static speed_inline void
rc_bit_tree_decode(rc_t *rc, uint16_t *p, int num_levels, int *symbol)
{
int i = num_levels;
*symbol = 1;
while (i--)
rc_get_bit(rc, p + *symbol, symbol);
*symbol -= 1 << num_levels;
}
typedef struct {
uint8_t pos;
uint32_t dict_size;
uint64_t dst_size;
} PACKED lzma_header_t;
/* #defines will force compiler to compute/optimize each one with each usage.
* Have heart and use enum instead. */
enum {
LZMA_BASE_SIZE = 1846,
LZMA_LIT_SIZE = 768,
LZMA_NUM_POS_BITS_MAX = 4,
LZMA_LEN_NUM_LOW_BITS = 3,
LZMA_LEN_NUM_MID_BITS = 3,
LZMA_LEN_NUM_HIGH_BITS = 8,
LZMA_LEN_CHOICE = 0,
LZMA_LEN_CHOICE_2 = (LZMA_LEN_CHOICE + 1),
LZMA_LEN_LOW = (LZMA_LEN_CHOICE_2 + 1),
LZMA_LEN_MID = (LZMA_LEN_LOW \
+ (1 << (LZMA_NUM_POS_BITS_MAX + LZMA_LEN_NUM_LOW_BITS))),
LZMA_LEN_HIGH = (LZMA_LEN_MID \
+ (1 << (LZMA_NUM_POS_BITS_MAX + LZMA_LEN_NUM_MID_BITS))),
LZMA_NUM_LEN_PROBS = (LZMA_LEN_HIGH + (1 << LZMA_LEN_NUM_HIGH_BITS)),
LZMA_NUM_STATES = 12,
LZMA_NUM_LIT_STATES = 7,
LZMA_START_POS_MODEL_INDEX = 4,
LZMA_END_POS_MODEL_INDEX = 14,
LZMA_NUM_FULL_DISTANCES = (1 << (LZMA_END_POS_MODEL_INDEX >> 1)),
LZMA_NUM_POS_SLOT_BITS = 6,
LZMA_NUM_LEN_TO_POS_STATES = 4,
LZMA_NUM_ALIGN_BITS = 4,
LZMA_MATCH_MIN_LEN = 2,
LZMA_IS_MATCH = 0,
LZMA_IS_REP = (LZMA_IS_MATCH + (LZMA_NUM_STATES << LZMA_NUM_POS_BITS_MAX)),
LZMA_IS_REP_G0 = (LZMA_IS_REP + LZMA_NUM_STATES),
LZMA_IS_REP_G1 = (LZMA_IS_REP_G0 + LZMA_NUM_STATES),
LZMA_IS_REP_G2 = (LZMA_IS_REP_G1 + LZMA_NUM_STATES),
LZMA_IS_REP_0_LONG = (LZMA_IS_REP_G2 + LZMA_NUM_STATES),
LZMA_POS_SLOT = (LZMA_IS_REP_0_LONG \
+ (LZMA_NUM_STATES << LZMA_NUM_POS_BITS_MAX)),
LZMA_SPEC_POS = (LZMA_POS_SLOT \
+ (LZMA_NUM_LEN_TO_POS_STATES << LZMA_NUM_POS_SLOT_BITS)),
LZMA_ALIGN = (LZMA_SPEC_POS \
+ LZMA_NUM_FULL_DISTANCES - LZMA_END_POS_MODEL_INDEX),
LZMA_LEN_CODER = (LZMA_ALIGN + (1 << LZMA_NUM_ALIGN_BITS)),
LZMA_REP_LEN_CODER = (LZMA_LEN_CODER + LZMA_NUM_LEN_PROBS),
LZMA_LITERAL = (LZMA_REP_LEN_CODER + LZMA_NUM_LEN_PROBS),
};
IF_DESKTOP(long long) int FAST_FUNC
unpack_lzma_stream(transformer_state_t *xstate)
{
IF_DESKTOP(long long total_written = 0;)
lzma_header_t header;
int lc, pb, lp;
uint32_t pos_state_mask;
uint32_t literal_pos_mask;
uint16_t *p;
rc_t *rc;
int i;
uint8_t *buffer;
uint32_t buffer_size;
uint8_t previous_byte = 0;
size_t buffer_pos = 0, global_pos = 0;
int len = 0;
int state = 0;
uint32_t rep0 = 1, rep1 = 1, rep2 = 1, rep3 = 1;
if (full_read(xstate->src_fd, &header, sizeof(header)) != sizeof(header)
|| header.pos >= (9 * 5 * 5)
) {
bb_simple_error_msg("bad lzma header");
return -1;
}
i = header.pos / 9;
lc = header.pos % 9;
pb = i / 5;
lp = i % 5;
pos_state_mask = (1 << pb) - 1;
literal_pos_mask = (1 << lp) - 1;
/* Example values from linux-3.3.4.tar.lzma:
* dict_size: 64M, dst_size: 2^64-1
*/
header.dict_size = SWAP_LE32(header.dict_size);
header.dst_size = SWAP_LE64(header.dst_size);
if (header.dict_size == 0)
header.dict_size++;
buffer_size = MIN(header.dst_size, header.dict_size);
buffer = xmalloc(buffer_size);
{
int num_probs;
num_probs = LZMA_BASE_SIZE + (LZMA_LIT_SIZE << (lc + lp));
p = xmalloc(num_probs * sizeof(*p));
num_probs += LZMA_LITERAL - LZMA_BASE_SIZE;
for (i = 0; i < num_probs; i++)
p[i] = (1 << RC_MODEL_TOTAL_BITS) >> 1;
}
rc = rc_init(xstate->src_fd); /*, RC_BUFFER_SIZE); */
while (global_pos + buffer_pos < header.dst_size) {
int pos_state = (buffer_pos + global_pos) & pos_state_mask;
uint16_t *prob = p + LZMA_IS_MATCH + (state << LZMA_NUM_POS_BITS_MAX) + pos_state;
if (!rc_is_bit_1(rc, prob)) {
static const char next_state[LZMA_NUM_STATES] =
{ 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
int mi = 1;
prob = (p + LZMA_LITERAL
+ (LZMA_LIT_SIZE * ((((buffer_pos + global_pos) & literal_pos_mask) << lc)
+ (previous_byte >> (8 - lc))
)
)
);
if (state >= LZMA_NUM_LIT_STATES) {
int match_byte;
uint32_t pos;
pos = buffer_pos - rep0;
if ((int32_t)pos < 0) {
pos += header.dict_size;
if ((int32_t)pos < 0)
goto bad;
}
match_byte = buffer[pos];
do {
int bit;
match_byte <<= 1;
bit = match_byte & 0x100;
bit ^= (rc_get_bit(rc, prob + 0x100 + bit + mi, &mi) << 8); /* 0x100 or 0 */
if (bit)
break;
} while (mi < 0x100);
}
while (mi < 0x100) {
rc_get_bit(rc, prob + mi, &mi);
}
state = next_state[state];
previous_byte = (uint8_t) mi;
#if ENABLE_FEATURE_LZMA_FAST
one_byte1:
buffer[buffer_pos++] = previous_byte;
if (buffer_pos == header.dict_size) {
buffer_pos = 0;
global_pos += header.dict_size;
if (transformer_write(xstate, buffer, header.dict_size) != (ssize_t)header.dict_size)
goto bad;
IF_DESKTOP(total_written += header.dict_size;)
}
#else
len = 1;
goto one_byte2;
#endif
} else {
int num_bits;
int offset;
uint16_t *prob2;
#define prob_len prob2
prob2 = p + LZMA_IS_REP + state;
if (!rc_is_bit_1(rc, prob2)) {
rep3 = rep2;
rep2 = rep1;
rep1 = rep0;
state = state < LZMA_NUM_LIT_STATES ? 0 : 3;
prob2 = p + LZMA_LEN_CODER;
} else {
prob2 += LZMA_IS_REP_G0 - LZMA_IS_REP;
if (!rc_is_bit_1(rc, prob2)) {
prob2 = (p + LZMA_IS_REP_0_LONG
+ (state << LZMA_NUM_POS_BITS_MAX)
+ pos_state
);
if (!rc_is_bit_1(rc, prob2)) {
#if ENABLE_FEATURE_LZMA_FAST
uint32_t pos;
state = state < LZMA_NUM_LIT_STATES ? 9 : 11;
pos = buffer_pos - rep0;
if ((int32_t)pos < 0) {
pos += header.dict_size;
/* see unzip_bad_lzma_2.zip: */
if (pos >= buffer_size) {
dbg("%d pos:%d buffer_size:%d", __LINE__, pos, buffer_size);
goto bad;
}
}
previous_byte = buffer[pos];
goto one_byte1;
#else
state = state < LZMA_NUM_LIT_STATES ? 9 : 11;
len = 1;
goto string;
#endif
}
} else {
uint32_t distance;
prob2 += LZMA_IS_REP_G1 - LZMA_IS_REP_G0;
distance = rep1;
if (rc_is_bit_1(rc, prob2)) {
prob2 += LZMA_IS_REP_G2 - LZMA_IS_REP_G1;
distance = rep2;
if (rc_is_bit_1(rc, prob2)) {
distance = rep3;
rep3 = rep2;
}
rep2 = rep1;
}
rep1 = rep0;
rep0 = distance;
}
state = state < LZMA_NUM_LIT_STATES ? 8 : 11;
prob2 = p + LZMA_REP_LEN_CODER;
}
prob_len = prob2 + LZMA_LEN_CHOICE;
num_bits = LZMA_LEN_NUM_LOW_BITS;
if (!rc_is_bit_1(rc, prob_len)) {
prob_len += LZMA_LEN_LOW - LZMA_LEN_CHOICE
+ (pos_state << LZMA_LEN_NUM_LOW_BITS);
offset = 0;
} else {
prob_len += LZMA_LEN_CHOICE_2 - LZMA_LEN_CHOICE;
if (!rc_is_bit_1(rc, prob_len)) {
prob_len += LZMA_LEN_MID - LZMA_LEN_CHOICE_2
+ (pos_state << LZMA_LEN_NUM_MID_BITS);
offset = 1 << LZMA_LEN_NUM_LOW_BITS;
num_bits += LZMA_LEN_NUM_MID_BITS - LZMA_LEN_NUM_LOW_BITS;
} else {
prob_len += LZMA_LEN_HIGH - LZMA_LEN_CHOICE_2;
offset = ((1 << LZMA_LEN_NUM_LOW_BITS)
+ (1 << LZMA_LEN_NUM_MID_BITS));
num_bits += LZMA_LEN_NUM_HIGH_BITS - LZMA_LEN_NUM_LOW_BITS;
}
}
rc_bit_tree_decode(rc, prob_len, num_bits, &len);
len += offset;
if (state < 4) {
int pos_slot;
uint16_t *prob3;
state += LZMA_NUM_LIT_STATES;
prob3 = p + LZMA_POS_SLOT +
((len < LZMA_NUM_LEN_TO_POS_STATES ? len :
LZMA_NUM_LEN_TO_POS_STATES - 1)
<< LZMA_NUM_POS_SLOT_BITS);
rc_bit_tree_decode(rc, prob3,
LZMA_NUM_POS_SLOT_BITS, &pos_slot);
rep0 = pos_slot;
if (pos_slot >= LZMA_START_POS_MODEL_INDEX) {
int i2, mi2, num_bits2 = (pos_slot >> 1) - 1;
rep0 = 2 | (pos_slot & 1);
if (pos_slot < LZMA_END_POS_MODEL_INDEX) {
rep0 <<= num_bits2;
prob3 = p + LZMA_SPEC_POS + rep0 - pos_slot - 1;
} else {
for (; num_bits2 != LZMA_NUM_ALIGN_BITS; num_bits2--)
rep0 = (rep0 << 1) | rc_direct_bit(rc);
rep0 <<= LZMA_NUM_ALIGN_BITS;
// Note: (int32_t)rep0 may be < 0 here
// (I have linux-3.3.4.tar.lzma which has it).
// I moved the check after "++rep0 == 0" check below.
prob3 = p + LZMA_ALIGN;
}
i2 = 1;
mi2 = 1;
while (num_bits2--) {
if (rc_get_bit(rc, prob3 + mi2, &mi2))
rep0 |= i2;
i2 <<= 1;
}
}
rep0++;
if ((int32_t)rep0 <= 0) {
if (rep0 == 0)
break;
dbg("%d rep0:%d", __LINE__, rep0);
goto bad;
}
}
len += LZMA_MATCH_MIN_LEN;
/*
* LZMA SDK has this optimized:
* it precalculates size and copies many bytes
* in a loop with simpler checks, a-la:
* do
* *(dest) = *(dest + ofs);
* while (++dest != lim);
* and
* do {
* buffer[buffer_pos++] = buffer[pos];
* if (++pos == header.dict_size)
* pos = 0;
* } while (--cur_len != 0);
* Our code is slower (more checks per byte copy):
*/
IF_NOT_FEATURE_LZMA_FAST(string:)
do {
uint32_t pos = buffer_pos - rep0;
if ((int32_t)pos < 0) {
pos += header.dict_size;
/* bug 10436 has an example file where this triggers: */
//if ((int32_t)pos < 0)
// goto bad;
/* more stringent test (see unzip_bad_lzma_1.zip): */
if (pos >= buffer_size)
goto bad;
}
previous_byte = buffer[pos];
IF_NOT_FEATURE_LZMA_FAST(one_byte2:)
buffer[buffer_pos++] = previous_byte;
if (buffer_pos == header.dict_size) {
buffer_pos = 0;
global_pos += header.dict_size;
if (transformer_write(xstate, buffer, header.dict_size) != (ssize_t)header.dict_size)
goto bad;
IF_DESKTOP(total_written += header.dict_size;)
}
len--;
} while (len != 0 && buffer_pos < header.dst_size);
/* FIXME: ...........^^^^^
* shouldn't it be "global_pos + buffer_pos < header.dst_size"?
* It probably should, but it is a "do we accidentally
* unpack more bytes than expected?" check - which
* never happens for well-formed compression data...
*/
}
}
{
IF_NOT_DESKTOP(int total_written = 0; /* success */)
IF_DESKTOP(total_written += buffer_pos;)
if (transformer_write(xstate, buffer, buffer_pos) != (ssize_t)buffer_pos) {
bad:
/* One of our users, bbunpack(), expects _us_ to emit
* the error message (since it's the best place to give
* potentially more detailed information).
* Do not fail silently.
*/
bb_simple_error_msg("corrupted data");
total_written = -1; /* failure */
}
rc_free(rc);
free(p);
free(buffer);
return total_written;
}
}
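
The header check at the top of unpack_lzma_stream() splits the single properties byte into `lc`, `lp` and `pb`. A small sketch of that step on its own follows; `lzma_decode_props` and `lzma_decode_props_demo` are hypothetical names, not BusyBox functions, and the arithmetic simply mirrors the `/ 9`, `% 9`, `/ 5`, `% 5` sequence used above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of BusyBox: split the LZMA properties
 * byte, which packs (pb * 5 + lp) * 9 + lc, into its three parameters.
 * Returns 0 on success, -1 if the byte is out of range. */
static int lzma_decode_props(uint8_t props, int *lc, int *lp, int *pb)
{
	int i;

	if (props >= 9 * 5 * 5)
		return -1;
	i = props / 9;
	*lc = props % 9;  /* literal context bits */
	*pb = i / 5;      /* position bits */
	*lp = i % 5;      /* literal position bits */
	return 0;
}

void lzma_decode_props_demo(void)
{
	int lc, lp, pb;

	/* 0x5D = 93 = (2 * 5 + 0) * 9 + 3 */
	assert(lzma_decode_props(0x5D, &lc, &lp, &pb) == 0);
	assert(lc == 3 && lp == 0 && pb == 2);
	/* 225 = 9 * 5 * 5 is the first invalid value */
	assert(lzma_decode_props(225, &lc, &lp, &pb) == -1);
}
```

The masks used by the decoder then follow directly as `(1 << pb) - 1` and `(1 << lp) - 1`.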


@ -0,0 +1,154 @@
/*
* This file uses XZ Embedded library code which is written
* by Lasse Collin <lasse.collin@tukaani.org>
* and Igor Pavlov <http://7-zip.org/>
*
* See README file in unxz/ directory for more information.
*
* This file is:
* Copyright (C) 2010 Denys Vlasenko <vda.linux@googlemail.com>
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
#define XZ_FUNC FAST_FUNC
#define XZ_EXTERN static
#define XZ_DEC_DYNALLOC
/* Skip check (rather than fail) of unsupported hash functions */
#define XZ_DEC_ANY_CHECK 1
/* We use our own crc32 function */
#define XZ_INTERNAL_CRC32 0
static uint32_t xz_crc32(const uint8_t *buf, size_t size, uint32_t crc)
{
return ~crc32_block_endian0(~crc, buf, size, global_crc32_table);
}
/* We use arch-optimized unaligned fixed-endian accessors.
* They have been moved to libbb (proved to be useful elsewhere as well),
* just check that we have them defined:
*/
#if !defined(get_unaligned_le32) \
|| !defined(get_unaligned_be32) \
|| !defined(put_unaligned_le32) \
|| !defined(put_unaligned_be32)
# error get_unaligned_le32 accessors are not defined
#endif
#include "unxz/xz_dec_bcj.c"
#include "unxz/xz_dec_lzma2.c"
#include "unxz/xz_dec_stream.c"
IF_DESKTOP(long long) int FAST_FUNC
unpack_xz_stream(transformer_state_t *xstate)
{
enum xz_ret xz_result;
struct xz_buf iobuf;
struct xz_dec *state;
unsigned char *membuf;
IF_DESKTOP(long long) int total = 0;
if (!global_crc32_table)
global_crc32_new_table_le();
memset(&iobuf, 0, sizeof(iobuf));
membuf = xmalloc(2 * BUFSIZ);
iobuf.in = membuf;
iobuf.out = membuf + BUFSIZ;
iobuf.out_size = BUFSIZ;
if (!xstate || xstate->signature_skipped) {
/* Preload XZ file signature */
strcpy((char*)membuf, HEADER_MAGIC);
iobuf.in_size = HEADER_MAGIC_SIZE;
} /* else: let xz code read & check it */
/* Limit memory usage to about 64 MiB. */
state = xz_dec_init(XZ_DYNALLOC, 64*1024*1024);
xz_result = XZ_OK;
while (1) {
if (iobuf.in_pos == iobuf.in_size) {
int rd = safe_read(xstate->src_fd, membuf, BUFSIZ);
if (rd < 0) {
bb_simple_error_msg(bb_msg_read_error);
total = -1;
break;
}
if (rd == 0 && xz_result == XZ_STREAM_END)
break;
iobuf.in_size = rd;
iobuf.in_pos = 0;
}
if (xz_result == XZ_STREAM_END) {
/*
* Try to start decoding next concatenated stream.
* Stream padding must always be a multiple of four
* bytes to preserve four-byte alignment. To keep the
* code slightly smaller, we aren't as strict here as
* the .xz spec requires. We just skip all zero-bytes
* without checking the alignment and thus can accept
* files that aren't valid, e.g. the XZ utils test
* files bad-0pad-empty.xz and bad-0catpad-empty.xz.
*/
do {
if (membuf[iobuf.in_pos] != 0) {
/* There is more data, but is it XZ data?
* Example: dpkg-deb -f busybox_1.30.1-4_amd64.deb
* reads control.tar.xz "control" file
* inside the ar archive, but tar.xz
* extraction code reaches end of xz data,
* reaches this code and reads the beginning
* of data.tar.xz's ar header, which isn't xz data,
* and prints "corrupted data".
* The correct solution is to not read
* past nested archive (to simulate EOF).
* This is a workaround:
*/
if (membuf[iobuf.in_pos] != 0xfd) {
/* It's definitely not an xz signature
* (which is 0xfd,"7zXZ",0x00).
*/
goto end;
}
xz_dec_reset(state);
goto do_run;
}
iobuf.in_pos++;
} while (iobuf.in_pos < iobuf.in_size);
}
do_run:
// bb_error_msg(">in pos:%d size:%d out pos:%d size:%d",
// iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size);
xz_result = xz_dec_run(state, &iobuf);
// bb_error_msg("<in pos:%d size:%d out pos:%d size:%d r:%d",
// iobuf.in_pos, iobuf.in_size, iobuf.out_pos, iobuf.out_size, xz_result);
if (iobuf.out_pos) {
xtransformer_write(xstate, iobuf.out, iobuf.out_pos);
IF_DESKTOP(total += iobuf.out_pos;)
iobuf.out_pos = 0;
}
if (xz_result == XZ_STREAM_END) {
/*
* Can just "break;" here, if not for concatenated
* .xz streams.
* Checking for padding may require buffer
* replenishment. Can't do it here.
*/
continue;
}
if (xz_result != XZ_OK && xz_result != XZ_UNSUPPORTED_CHECK) {
bb_simple_error_msg("corrupted data");
total = -1;
break;
}
}
end:
xz_dec_end(state);
free(membuf);
return total;
}


@ -0,0 +1,16 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2002 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
/* Accept any non-null name, it's not really a filter at all */
char FAST_FUNC filter_accept_all(archive_handle_t *archive_handle)
{
if (archive_handle->file_header->name)
return EXIT_SUCCESS;
return EXIT_FAILURE;
}


@ -0,0 +1,18 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2002 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
/*
* Accept names that are in the accept list, ignoring reject list.
*/
char FAST_FUNC filter_accept_list(archive_handle_t *archive_handle)
{
if (find_list_entry(archive_handle->accept, archive_handle->file_header->name))
return EXIT_SUCCESS;
return EXIT_FAILURE;
}


@ -0,0 +1,60 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2002 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
/* Built and used only if ENABLE_DPKG || ENABLE_DPKG_DEB */
/*
* Reassign the subarchive metadata parser based on the filename extension
* e.g. if it's a .tar.gz, set archive_handle->dpkg__action_data_subarchive to process a .tar.gz,
* or if it's a .tar.bz2, make archive_handle->dpkg__action_data_subarchive handle that
*/
char FAST_FUNC filter_accept_list_reassign(archive_handle_t *archive_handle)
{
/* Check the file entry is in the accept list */
if (find_list_entry(archive_handle->accept, archive_handle->file_header->name)) {
const char *name_ptr;
/* Find extension */
name_ptr = strrchr(archive_handle->file_header->name, '.');
if (!name_ptr)
return EXIT_FAILURE;
name_ptr++;
/* Modify the subarchive handler based on the extension */
if (strcmp(name_ptr, "tar") == 0) {
archive_handle->dpkg__action_data_subarchive = get_header_tar;
return EXIT_SUCCESS;
}
if (ENABLE_FEATURE_SEAMLESS_GZ
&& strcmp(name_ptr, "gz") == 0
) {
archive_handle->dpkg__action_data_subarchive = get_header_tar_gz;
return EXIT_SUCCESS;
}
if (ENABLE_FEATURE_SEAMLESS_BZ2
&& strcmp(name_ptr, "bz2") == 0
) {
archive_handle->dpkg__action_data_subarchive = get_header_tar_bz2;
return EXIT_SUCCESS;
}
if (ENABLE_FEATURE_SEAMLESS_LZMA
&& strcmp(name_ptr, "lzma") == 0
) {
archive_handle->dpkg__action_data_subarchive = get_header_tar_lzma;
return EXIT_SUCCESS;
}
if (ENABLE_FEATURE_SEAMLESS_XZ
&& strcmp(name_ptr, "xz") == 0
) {
archive_handle->dpkg__action_data_subarchive = get_header_tar_xz;
return EXIT_SUCCESS;
}
}
return EXIT_FAILURE;
}


@ -0,0 +1,37 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2002 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
/*
* Accept names that are in the accept list and not in the reject list
*/
char FAST_FUNC filter_accept_reject_list(archive_handle_t *archive_handle)
{
const char *key;
const llist_t *reject_entry;
const llist_t *accept_entry;
key = archive_handle->file_header->name;
/* If the key is in a reject list fail */
reject_entry = find_list_entry2(archive_handle->reject, key);
if (reject_entry) {
return EXIT_FAILURE;
}
/* Fail if an accept list was specified and the key wasn't in there */
if (archive_handle->accept) {
accept_entry = find_list_entry2(archive_handle->accept, key);
if (!accept_entry) {
return EXIT_FAILURE;
}
}
/* Accepted */
return EXIT_SUCCESS;
}


@ -0,0 +1,53 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright (C) 2002 by Glenn McGrath
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include <fnmatch.h>
#include "libbb.h"
#include "bb_archive.h"
/* Find a string in a shell pattern list */
const llist_t* FAST_FUNC find_list_entry(const llist_t *list, const char *filename)
{
while (list) {
if (fnmatch(list->data, filename, 0) == 0) {
return list;
}
list = list->link;
}
return NULL;
}
/* Same, but compares only path components present in pattern
* (extra trailing path components in filename are assumed to match)
*/
const llist_t* FAST_FUNC find_list_entry2(const llist_t *list, const char *filename)
{
char buf[PATH_MAX];
int pattern_slash_cnt;
const char *c;
char *d;
while (list) {
c = list->data;
pattern_slash_cnt = 0;
while (*c)
if (*c++ == '/') pattern_slash_cnt++;
c = filename;
d = buf;
/* paranoia is better than buffer overflows */
while (*c && d != buf + sizeof(buf)-1) {
if (*c == '/' && --pattern_slash_cnt < 0)
break;
*d++ = *c++;
}
*d = '\0';
if (fnmatch(list->data, buf, 0) == 0) {
return list;
}
list = list->link;
}
return NULL;
}
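
find_list_entry2() above counts the '/' characters in the pattern, copies the filename only up to the first surplus '/', and then matches the truncated copy with fnmatch(). The sketch below restates that idea for a single pattern; `match_leading_components` and its demo function are hypothetical names, and the fixed 4096-byte buffer is an assumption standing in for PATH_MAX.

```c
#include <assert.h>
#include <fnmatch.h>

/* Hypothetical helper, not part of BusyBox: match filename against a
 * shell pattern, comparing only as many leading path components as the
 * pattern itself contains. Returns 1 on match, 0 otherwise. */
static int match_leading_components(const char *pattern, const char *filename)
{
	char buf[4096];  /* assumed bound; the original uses PATH_MAX */
	int slashes = 0;
	const char *c;
	char *d = buf;

	for (c = pattern; *c; c++)
		if (*c == '/')
			slashes++;
	/* copy filename up to, but not including, the first surplus '/' */
	for (c = filename; *c && d != buf + sizeof(buf) - 1; c++) {
		if (*c == '/' && --slashes < 0)
			break;
		*d++ = *c;
	}
	*d = '\0';
	return fnmatch(pattern, buf, 0) == 0;
}

void match_leading_components_demo(void)
{
	/* "etc" has no slash, so only the first component is compared */
	assert(match_leading_components("etc", "etc/hosts"));
	/* "*.txt" is compared against "dir", which does not match */
	assert(!match_leading_components("*.txt", "dir/a.txt"));
}
```

This is why a reject entry naming a directory also rejects everything below it: the trailing components are cut off before the comparison.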


@ -0,0 +1,146 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright 2001 Glenn McGrath.
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
#include "ar_.h"
/* WARNING: Clobbers str[len], so fields must be read in reverse order! */
static unsigned read_num(char *str, int base, int len)
{
int err;
/* ar fields are fixed length text strings (padded with spaces).
* Ensure bb_strtou doesn't read past the field in case the full
* width is used. */
str[len] = 0;
/* This code works because
* on misformatted numbers bb_strtou returns all-ones */
err = bb_strtou(str, NULL, base);
if (err == -1)
bb_simple_error_msg_and_die("invalid ar header");
return err;
}
char FAST_FUNC get_header_ar(archive_handle_t *archive_handle)
{
file_header_t *typed = archive_handle->file_header;
unsigned size;
union {
char raw[60];
struct ar_header formatted;
} ar;
/* don't use xread as we want to handle the error ourselves */
if (read(archive_handle->src_fd, ar.raw, 60) != 60) {
/* End Of File */
return EXIT_FAILURE;
}
/* ar header starts on an even byte (2 byte aligned)
* '\n' is used for padding
*/
if (ar.raw[0] == '\n') {
/* fix up the header, we started reading 1 byte too early */
memmove(ar.raw, &ar.raw[1], 59);
ar.raw[59] = xread_char(archive_handle->src_fd);
archive_handle->offset++;
}
archive_handle->offset += 60;
if (ar.formatted.magic[0] != '`' || ar.formatted.magic[1] != '\n')
bb_simple_error_msg_and_die("invalid ar header");
/*
* Note that the fields MUST be read in reverse order as
* read_num() clobbers the next byte after the field!
* Order is: name, date, uid, gid, mode, size, magic.
*/
typed->size = size = read_num(ar.formatted.size, 10,
sizeof(ar.formatted.size));
/* special filenames have '/' as the first character */
if (ar.formatted.name[0] == '/') {
if (ar.formatted.name[1] == ' ') {
/* This is the index of symbols in the file for compilers */
data_skip(archive_handle);
archive_handle->offset += size;
return get_header_ar(archive_handle); /* Return next header */
}
#if ENABLE_FEATURE_AR_LONG_FILENAMES
if (ar.formatted.name[1] == '/') {
/* If the second char is a '/' then this entry's data section
* stores long filenames for multiple entries; they are stored
* in archive_handle->ar__long_names for use in future entries
*/
archive_handle->ar__long_name_size = size;
free(archive_handle->ar__long_names);
archive_handle->ar__long_names = xzalloc(size + 1);
xread(archive_handle->src_fd, archive_handle->ar__long_names, size);
archive_handle->offset += size;
/* Return next header */
return get_header_ar(archive_handle);
}
#else
bb_simple_error_msg_and_die("long filenames not supported");
#endif
}
/* Only size is always present, the rest may be missing in
* long filename pseudo file. Thus we decode the rest
* after dealing with long filename pseudo file.
*
* GNU binutils in deterministic mode hard codes mode to 0644 (NOT
* 0100644). AR archives can only contain files, so force file
* mode.
*/
typed->mode = read_num(ar.formatted.mode, 8, sizeof(ar.formatted.mode)) | S_IFREG;
typed->gid = read_num(ar.formatted.gid, 10, sizeof(ar.formatted.gid));
typed->uid = read_num(ar.formatted.uid, 10, sizeof(ar.formatted.uid));
typed->mtime = read_num(ar.formatted.date, 10, sizeof(ar.formatted.date));
#if ENABLE_FEATURE_AR_LONG_FILENAMES
if (ar.formatted.name[0] == '/') {
unsigned long_offset;
/* The number after the '/' indicates the offset in the ar data section
* (saved in ar__long_names) that contains the real filename */
long_offset = read_num(&ar.formatted.name[1], 10,
sizeof(ar.formatted.name) - 1);
if (long_offset >= archive_handle->ar__long_name_size) {
bb_simple_error_msg_and_die("can't resolve long filename");
}
typed->name = xstrdup(archive_handle->ar__long_names + long_offset);
} else
#endif
{
/* short filenames */
typed->name = xstrndup(ar.formatted.name, 16);
}
typed->name[strcspn(typed->name, " /")] = '\0';
if (archive_handle->filter(archive_handle) == EXIT_SUCCESS) {
archive_handle->action_header(typed);
#if ENABLE_DPKG || ENABLE_DPKG_DEB
if (archive_handle->dpkg__sub_archive) {
struct archive_handle_t *sa = archive_handle->dpkg__sub_archive;
while (archive_handle->dpkg__action_data_subarchive(sa) == EXIT_SUCCESS)
continue;
create_links_from_list(sa->link_placeholders);
} else
#endif
archive_handle->action_data(archive_handle);
} else {
data_skip(archive_handle);
}
archive_handle->offset += typed->size;
/* Set the file pointer to the correct spot; we may have been reading a compressed file */
lseek(archive_handle->src_fd, archive_handle->offset, SEEK_SET);
return EXIT_SUCCESS;
}
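
read_num() above terminates each fixed-width field in place, which is why the fields have to be decoded in reverse order. As a contrast, here is a sketch of a non-destructive variant that copies the field first. It is not what BusyBox does: `ar_field_to_uint` is a hypothetical name, and it uses the standard strtoul() instead of bb_strtou(), so it also accepts leading whitespace and a sign.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of BusyBox: parse a fixed-width,
 * space-padded ar header field without modifying the header.
 * Returns 0 and stores the value in *out, or -1 if the field does not
 * hold a number followed only by spaces. */
static int ar_field_to_uint(const char *field, size_t len, int base,
		unsigned long *out)
{
	char tmp[17];  /* the widest ar header field is the 16-byte name */
	char *end;

	if (len >= sizeof(tmp))
		return -1;
	memcpy(tmp, field, len);
	tmp[len] = '\0';
	*out = strtoul(tmp, &end, base);
	if (end == tmp)
		return -1;  /* no digits at all */
	while (*end == ' ')
		end++;      /* allow trailing space padding */
	return *end == '\0' ? 0 : -1;
}

void ar_field_to_uint_demo(void)
{
	unsigned long v;

	assert(ar_field_to_uint("1234      ", 10, 10, &v) == 0 && v == 1234);
	assert(ar_field_to_uint("100644  ", 8, 8, &v) == 0 && v == 0100644);
}
```

The copy costs a few bytes of stack per call but removes the ordering constraint; the in-place version in the source is presumably chosen for code size.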


@ -0,0 +1,192 @@
/* vi: set sw=4 ts=4: */
/*
* Copyright 2002 Laurence Anderson
*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
typedef struct hardlinks_t {
struct hardlinks_t *next;
int inode; /* TODO: must match maj/min too! */
int mode ;
int mtime; /* These three are useful only in corner case */
int uid ; /* of hardlinks with zero size body */
int gid ;
char name[1];
} hardlinks_t;
char FAST_FUNC get_header_cpio(archive_handle_t *archive_handle)
{
file_header_t *file_header = archive_handle->file_header;
char cpio_header[111];
int namesize;
int major, minor, nlink, mode, inode;
unsigned size, uid, gid, mtime;
/* There can be padding before archive header */
data_align(archive_handle, 4);
size = full_read(archive_handle->src_fd, cpio_header, 110);
if (size == 0) {
goto create_hardlinks;
}
if (size != 110) {
bb_simple_error_msg_and_die("short read");
}
archive_handle->offset += 110;
if (!is_prefixed_with(&cpio_header[0], "07070")
|| (cpio_header[5] != '1' && cpio_header[5] != '2')
) {
bb_simple_error_msg_and_die("unsupported cpio format, use newc or crc");
}
cpio_header[110] = '\0'; /* sscanf may call strlen which may break without this */
if (sscanf(cpio_header + 6,
"%8x" "%8x" "%8x" "%8x"
"%8x" "%8x" "%8x" /*maj,min:*/ "%*16c"
/*rmaj,rmin:*/"%8x" "%8x" "%8x" /*chksum: "%*8c"*/,
&inode, &mode, &uid, &gid,
&nlink, &mtime, &size,
&major, &minor, &namesize) != 10)
bb_simple_error_msg_and_die("damaged cpio file");
file_header->mode = mode;
/* "cpio -R USER:GRP" support: */
if (archive_handle->cpio__owner.uid != (uid_t)-1L)
uid = archive_handle->cpio__owner.uid;
if (archive_handle->cpio__owner.gid != (gid_t)-1L)
gid = archive_handle->cpio__owner.gid;
file_header->uid = uid;
file_header->gid = gid;
file_header->mtime = mtime;
file_header->size = size;
namesize &= 0x1fff; /* paranoia: limit names to 8k chars */
file_header->name = xzalloc(namesize + 1);
/* Read in filename */
xread(archive_handle->src_fd, file_header->name, namesize);
if (file_header->name[0] == '/') {
/* Testcase: echo /etc/hosts | cpio -pvd /tmp
* Without this code, it tries to unpack /etc/hosts
* into "/etc/hosts", not "etc/hosts".
*/
char *p = file_header->name;
do p++; while (*p == '/');
overlapping_strcpy(file_header->name, p);
}
archive_handle->offset += namesize;
/* Update offset amount and skip padding before file contents */
data_align(archive_handle, 4);
if (strcmp(file_header->name, cpio_TRAILER) == 0) {
/* Always round up. ">> 9" divides by 512 */
archive_handle->cpio__blocks = (uoff_t)(archive_handle->offset + 511) >> 9;
goto create_hardlinks;
}
file_header->link_target = NULL;
if (S_ISLNK(file_header->mode)) {
file_header->size &= 0x1fff; /* paranoia: limit names to 8k chars */
file_header->link_target = xzalloc(file_header->size + 1);
xread(archive_handle->src_fd, file_header->link_target, file_header->size);
archive_handle->offset += file_header->size;
file_header->size = 0; /* Stop possible seeks in future */
}
// TODO: data_extract_all can't deal with hardlinks to non-files...
// when fixed, change S_ISREG to !S_ISDIR here
if (nlink > 1 && S_ISREG(file_header->mode)) {
hardlinks_t *new = xmalloc(sizeof(*new) + namesize);
new->inode = inode;
new->mode = mode ;
new->mtime = mtime;
new->uid = uid ;
new->gid = gid ;
strcpy(new->name, file_header->name);
/* Put file on a linked list for later */
if (size == 0) {
new->next = archive_handle->cpio__hardlinks_to_create;
archive_handle->cpio__hardlinks_to_create = new;
return EXIT_SUCCESS; /* Skip this one */
/* TODO: this breaks cpio -t (it does not show hardlinks) */
}
new->next = archive_handle->cpio__created_hardlinks;
archive_handle->cpio__created_hardlinks = new;
}
file_header->device = makedev(major, minor);
if (archive_handle->filter(archive_handle) == EXIT_SUCCESS) {
archive_handle->action_data(archive_handle);
//TODO: run "echo /etc/hosts | cpio -pv /tmp" twice. On 2nd run:
//cpio: etc/hosts not created: newer or same age file exists
//etc/hosts <-- should NOT show it
//2 blocks <-- should say "0 blocks"
archive_handle->action_header(file_header);
} else {
data_skip(archive_handle);
}
archive_handle->offset += file_header->size;
free(file_header->link_target);
free(file_header->name);
file_header->link_target = NULL;
file_header->name = NULL;
return EXIT_SUCCESS;
create_hardlinks:
free(file_header->link_target);
free(file_header->name);
while (archive_handle->cpio__hardlinks_to_create) {
hardlinks_t *cur;
hardlinks_t *make_me = archive_handle->cpio__hardlinks_to_create;
archive_handle->cpio__hardlinks_to_create = make_me->next;
memset(file_header, 0, sizeof(*file_header));
file_header->mtime = make_me->mtime;
file_header->name = make_me->name;
file_header->mode = make_me->mode;
file_header->uid = make_me->uid;
file_header->gid = make_me->gid;
/*file_header->size = 0;*/
/*file_header->link_target = NULL;*/
/* Try to find a file we are hardlinked to */
cur = archive_handle->cpio__created_hardlinks;
while (cur) {
/* TODO: must match maj/min too! */
if (cur->inode == make_me->inode) {
file_header->link_target = cur->name;
/* link_target != NULL, size = 0: "I am a hardlink" */
if (archive_handle->filter(archive_handle) == EXIT_SUCCESS)
archive_handle->action_data(archive_handle);
free(make_me);
goto next_link;
}
cur = cur->next;
}
/* Oops... no file with such inode was created... do it now
* (happens when hardlinked files are empty (zero length)) */
if (archive_handle->filter(archive_handle) == EXIT_SUCCESS)
archive_handle->action_data(archive_handle);
/* Move to the list of created hardlinked files */
make_me->next = archive_handle->cpio__created_hardlinks;
archive_handle->cpio__created_hardlinks = make_me;
next_link: ;
}
while (archive_handle->cpio__created_hardlinks) {
hardlinks_t *p = archive_handle->cpio__created_hardlinks;
archive_handle->cpio__created_hardlinks = p->next;
free(p);
}
return EXIT_FAILURE; /* "No more files to process" */
}
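
Reviewer note: the deferred hardlink handling above can be sketched in isolation. This is a simplified illustration, not the BusyBox code: `struct link_node`, `find_created` and `resolve_pending` are hypothetical names, nodes are caller-owned (nothing is freed), and matching is by inode only, as in the original (which carries a TODO about also matching major/minor).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the real code uses hardlinks_t. */
struct link_node {
	unsigned long inode;      /* inode number recorded in the archive */
	const char *name;         /* path of this entry */
	const char *target;       /* set when the entry is resolved as a hardlink */
	struct link_node *next;
};

/* Return the name of an already created entry with this inode, or NULL. */
static const char *find_created(const struct link_node *created, unsigned long inode)
{
	for (; created != NULL; created = created->next)
		if (created->inode == inode)
			return created->name;
	return NULL;
}

/* Drain the pending list. An entry whose inode is already on the created
 * list becomes a hardlink to it; otherwise it is treated as a new (empty)
 * file and moved to the created list so later entries can link to it.
 * Returns the number of entries resolved as hardlinks. */
static unsigned resolve_pending(struct link_node **pending, struct link_node **created)
{
	unsigned links = 0;

	assert(pending != NULL && created != NULL);
	while (*pending != NULL) {
		struct link_node *n = *pending;
		*pending = n->next;
		n->target = find_created(*created, n->inode);
		if (n->target != NULL) {
			n->next = NULL;
			links++;
		} else {
			n->next = *created;
			*created = n;
		}
	}
	return links;
}
```

With one created entry for inode 7 and pending entries for inodes 7, 9 and 9, the first pending entry links to the created file, the second becomes a new file, and the third links to the second.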


@ -0,0 +1,491 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*
* FIXME:
* In privileged mode if uname and gname map to a uid and gid then use the
* mapped value instead of the uid/gid values in tar header
*
* References:
* GNU tar and star man pages,
* Opengroup's ustar interchange format,
* http://www.opengroup.org/onlinepubs/007904975/utilities/pax.html
*/
#include "libbb.h"
#include "bb_archive.h"
typedef uint32_t aliased_uint32_t FIX_ALIASING;
typedef off_t aliased_off_t FIX_ALIASING;
/* NB: _DESTROYS_ str[len] character! */
static unsigned long long getOctal(char *str, int len)
{
unsigned long long v;
char *end;
/* NB: leading spaces are allowed. Using strtoull to handle that.
* The downside is that we accept e.g. "-123" too :(
*/
str[len] = '\0';
v = strtoull(str, &end, 8);
/* std: "Each numeric field is terminated by one or more
* <space> or NUL characters". We must support ' '! */
if (*end != '\0' && *end != ' ') {
int8_t first = str[0];
if (!(first & 0x80))
bb_simple_error_msg_and_die("corrupted octal value in tar header");
/*
* GNU tar uses "base-256 encoding" for very large numbers.
* Encoding is binary, with highest bit always set as a marker
* and sign in next-highest bit:
* 80 00 .. 00 - zero
* bf ff .. ff - largest positive number
* ff ff .. ff - minus 1
* c0 00 .. 00 - smallest negative number
*
* Example of tar file with 8914993153 (0x213600001) byte file.
* Field starts at offset 7c:
* 00070 30 30 30 00 30 30 30 30 30 30 30 00 80 00 00 00 |000.0000000.....|
* 00080 00 00 00 02 13 60 00 01 31 31 31 32 30 33 33 36 |.....`..11120336|
*
* NB: tarballs with NEGATIVE unix times encoded that way were seen!
*/
/* Sign-extend 7bit 'first' to 64bit 'v' (that is, using 6th bit as sign): */
first <<= 1;
first >>= 1; /* now 7th bit = 6th bit */
v = first; /* sign-extend 8 bits to 64 */
while (--len != 0)
v = (v << 8) + (uint8_t) *++str;
}
return v;
}
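
Reviewer note: the two encodings handled by getOctal() can be illustrated with a standalone decoder. This is a sketch under assumptions, not a replacement: `tar_field_decode` is a hypothetical name, it leaves the buffer untouched (unlike getOctal(), which overwrites str[len]), and it returns 0 for a field with no octal digits instead of reporting corruption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a tar numeric field of `len` bytes.
 * - If bit 7 of the first byte is set, the field is GNU base-256:
 *   bit 6 of the first byte is the sign, the rest is big-endian binary.
 * - Otherwise it is octal text with optional leading spaces,
 *   terminated by a space, a NUL, or the end of the field. */
static int64_t tar_field_decode(const unsigned char *field, size_t len)
{
	uint64_t v;
	size_t i;

	assert(field != NULL && len > 0);

	if (field[0] & 0x80) {
		v = field[0] & 0x7f;              /* drop the marker bit */
		if (v & 0x40)
			v |= ~(uint64_t)0x7f;     /* sign-extend from bit 6 */
		for (i = 1; i < len; i++)
			v = (v << 8) | field[i];  /* high bits fall off for long fields */
		return (int64_t)v;
	}

	v = 0;
	i = 0;
	while (i < len && field[i] == ' ')
		i++;
	while (i < len && field[i] >= '0' && field[i] <= '7') {
		v = (v << 3) | (uint64_t)(field[i] - '0');
		i++;
	}
	return (int64_t)v;
}
```

The arithmetic is done on an unsigned 64-bit value so that shifting never touches a negative signed operand; the final conversion to `int64_t` relies on the usual two's-complement behavior.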
#define GET_OCTAL(a) getOctal((a), sizeof(a))
#define TAR_EXTD (ENABLE_FEATURE_TAR_GNU_EXTENSIONS || ENABLE_FEATURE_TAR_SELINUX)
#if !TAR_EXTD
#define process_pax_hdr(archive_handle, sz, global) \
process_pax_hdr(archive_handle, sz)
#endif
/* "global" is 0 or 1 */
static void process_pax_hdr(archive_handle_t *archive_handle, unsigned sz, int global)
{
#if !TAR_EXTD
unsigned blk_sz = (sz + 511) & (~511);
seek_by_read(archive_handle->src_fd, blk_sz);
#else
unsigned blk_sz = (sz + 511) & (~511);
char *buf, *p;
p = buf = xmalloc(blk_sz + 1);
xread(archive_handle->src_fd, buf, blk_sz);
archive_handle->offset += blk_sz;
/* prevent bb_strtou from running off the buffer */
buf[sz] = '\0';
while (sz != 0) {
char *end, *value;
unsigned len;
/* Every record has this format: "LEN NAME=VALUE\n" */
len = bb_strtou(p, &end, 10);
/* expect errno to be EINVAL, because the character
* following the digits should be a space
*/
p += len;
sz -= len;
if (
/** (int)sz < 0 - not good enough for huge malicious VALUE of 2^32-1 */
(int)(sz|len) < 0 /* this works */
|| len == 0
|| errno != EINVAL
|| *end != ' '
) {
bb_simple_error_msg("malformed extended header, skipped");
// More verbose version:
//bb_error_msg("malformed extended header at %"OFF_FMT"d, skipped",
// archive_handle->offset - (sz + len));
break;
}
/* overwrite the terminating newline with NUL
* (we do not bother to check that it *was* a newline)
*/
p[-1] = '\0';
value = end + 1;
# if ENABLE_FEATURE_TAR_GNU_EXTENSIONS
if (!global) {
if (is_prefixed_with(value, "path=")) {
value += sizeof("path=") - 1;
free(archive_handle->tar__longname);
archive_handle->tar__longname = xstrdup(value);
continue;
}
if (is_prefixed_with(value, "linkpath=")) {
value += sizeof("linkpath=") - 1;
free(archive_handle->tar__linkname);
archive_handle->tar__linkname = xstrdup(value);
continue;
}
}
# endif
# if ENABLE_FEATURE_TAR_SELINUX
/* Scan for SELinux contexts, via "RHT.security.selinux" keyword.
* This is what Red Hat's patched version of tar uses.
*/
# define SELINUX_CONTEXT_KEYWORD "RHT.security.selinux"
if (is_prefixed_with(value, SELINUX_CONTEXT_KEYWORD"=")) {
value += sizeof(SELINUX_CONTEXT_KEYWORD"=") - 1;
free(archive_handle->tar__sctx[global]);
archive_handle->tar__sctx[global] = xstrdup(value);
continue;
}
# endif
}
free(buf);
#endif
}
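
Reviewer note: the "LEN NAME=VALUE\n" record walk in process_pax_hdr() can be shown on its own. The helper below is a hypothetical sketch (`pax_lookup` is not a BusyBox function): it parses the decimal length by hand so it never reads past the buffer, and it is stricter than the original in that it requires the trailing newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Search pax records in buf[0..sz) for `key` and copy its value into `out`
 * (NUL-terminated). LEN counts the whole record: digits, space, body, '\n'.
 * Returns 0 on success, -1 if the key is missing or a record is malformed. */
static int pax_lookup(const char *buf, size_t sz, const char *key,
		char *out, size_t outsz)
{
	size_t klen;

	assert(buf != NULL && key != NULL && out != NULL && outsz > 0);
	klen = strlen(key);

	while (sz > 0) {
		size_t len = 0, digits = 0, blen;
		const char *body;

		while (digits < sz && buf[digits] >= '0' && buf[digits] <= '9') {
			len = len * 10 + (size_t)(buf[digits] - '0');
			digits++;
			if (len > sz)
				return -1;      /* record claims more than we have */
		}
		if (digits == 0 || digits >= sz || buf[digits] != ' ')
			return -1;
		if (len < digits + 2 || buf[len - 1] != '\n')
			return -1;

		body = buf + digits + 1;        /* start of NAME=VALUE */
		blen = len - digits - 2;        /* without "LEN " and '\n' */
		if (blen > klen && memcmp(body, key, klen) == 0 && body[klen] == '=') {
			size_t vlen = blen - klen - 1;
			if (vlen + 1 > outsz)
				return -1;
			memcpy(out, body + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		buf += len;
		sz -= len;
	}
	return -1;
}
```

For the 26-byte input "12 path=abc\n14 linkpath=x\n", looking up "path" yields "abc" and looking up "linkpath" yields "x".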
#if ENABLE_FEATURE_TAR_GNU_EXTENSIONS
static void die_if_bad_fnamesize(off_t sz)
{
if ((uoff_t)sz > 0xfff) /* more than 4k?! no funny business please */
bb_simple_error_msg_and_die("bad archive");
}
#endif
char FAST_FUNC get_header_tar(archive_handle_t *archive_handle)
{
file_header_t *file_header = archive_handle->file_header;
struct tar_header_t tar;
char *cp;
int tar_typeflag; /* can be "char", "int" seems to give smaller code */
int i, sum_u, sum;
#if ENABLE_FEATURE_TAR_OLDSUN_COMPATIBILITY
int sum_s;
#endif
int parse_names;
/* Our "private data" */
#if ENABLE_FEATURE_TAR_GNU_EXTENSIONS
# define p_longname (archive_handle->tar__longname)
# define p_linkname (archive_handle->tar__linkname)
#else
# define p_longname 0
# define p_linkname 0
#endif
#if ENABLE_FEATURE_TAR_GNU_EXTENSIONS || ENABLE_FEATURE_TAR_SELINUX
again:
#endif
/* Align header */
data_align(archive_handle, 512);
again_after_align:
#if ENABLE_DESKTOP || ENABLE_FEATURE_TAR_AUTODETECT
/* to prevent misdetection of bz2 sig */
*(aliased_uint32_t*)&tar = 0;
i = full_read(archive_handle->src_fd, &tar, 512);
/* If GNU tar sees EOF in above read, it says:
* "tar: A lone zero block at N", where N = kilobyte
* where EOF was met (not EOF block, actual EOF!),
* and exits with EXIT_SUCCESS.
* We will mimic exit(EXIT_SUCCESS), although we will not mimic
* the message and we don't check whether we indeed
* saw zero block directly before this. */
if (i == 0) {
/* GNU tar 1.29 will be silent if tar archive ends abruptly
* (if there are no zero blocks at all, and last read returns zero,
* not short read 0 < len < 512). Complain only if
* the very first read fails. Grrr.
*/
if (archive_handle->offset == 0)
bb_simple_error_msg("short read");
/* this merely signals end of archive, not exit(1): */
return EXIT_FAILURE;
}
if (i != 512) {
IF_FEATURE_TAR_AUTODETECT(goto autodetect;)
bb_simple_error_msg_and_die("short read");
}
#else
i = 512;
xread(archive_handle->src_fd, &tar, i);
#endif
archive_handle->offset += i;
/* If there is no filename it's an empty header */
if (tar.name[0] == 0 && tar.prefix[0] == 0
/* Have seen a tar archive with pax 'x' header supplying UTF8 filename,
* with actual file having all name fields NUL-filled. Check this: */
&& !p_longname
) {
if (archive_handle->tar__end) {
/* Second consecutive empty header - end of archive.
* Read until the end to empty the pipe from gz or bz2
*/
while (full_read(archive_handle->src_fd, &tar, 512) == 512)
continue;
return EXIT_FAILURE; /* "end of archive" */
}
archive_handle->tar__end = 1;
return EXIT_SUCCESS; /* "decoded one header" */
}
archive_handle->tar__end = 0;
/* Check header has valid magic, "ustar" is for the proper tar,
* five NULs are for the old tar format */
if (!is_prefixed_with(tar.magic, "ustar")
&& (!ENABLE_FEATURE_TAR_OLDGNU_COMPATIBILITY
|| memcmp(tar.magic, "\0\0\0\0", 5) != 0)
) {
#if ENABLE_FEATURE_TAR_AUTODETECT
autodetect:
/* Two different causes for lseek() != 0:
* unseekable fd (would like to support that too, but...),
* or not first block (false positive, it's not .gz/.bz2!) */
if (lseek(archive_handle->src_fd, -i, SEEK_CUR) != 0)
goto err;
if (setup_unzip_on_fd(archive_handle->src_fd, /*fail_if_not_compressed:*/ 0) != 0)
err:
bb_simple_error_msg_and_die("invalid tar magic");
archive_handle->offset = 0;
goto again_after_align;
#endif
bb_simple_error_msg_and_die("invalid tar magic");
}
/* Do checksum on headers.
* POSIX says that checksum is done on unsigned bytes, but
* Sun and HP-UX get it wrong... more details in
* GNU tar source. */
sum_u = ' ' * sizeof(tar.chksum);
#if ENABLE_FEATURE_TAR_OLDSUN_COMPATIBILITY
sum_s = sum_u;
#endif
for (i = 0; i < 148; i++) {
sum_u += ((unsigned char*)&tar)[i];
#if ENABLE_FEATURE_TAR_OLDSUN_COMPATIBILITY
sum_s += ((signed char*)&tar)[i];
#endif
}
for (i = 156; i < 512; i++) {
sum_u += ((unsigned char*)&tar)[i];
#if ENABLE_FEATURE_TAR_OLDSUN_COMPATIBILITY
sum_s += ((signed char*)&tar)[i];
#endif
}
/* Most tarfiles have tar.chksum NUL or space terminated, but
* github.com decided to be "special" and have unterminated field:
* 0090: 30343300 30303031 33323731 30000000 |043.000132710...|
* ^^^^^^^^|
* Need to use GET_OCTAL. This overwrites tar.typeflag ---+
* (the '0' char immediately after chksum in example above) with NUL.
*/
tar_typeflag = (uint8_t)tar.typeflag; /* save it */
sum = GET_OCTAL(tar.chksum);
if (sum_u != sum
IF_FEATURE_TAR_OLDSUN_COMPATIBILITY(&& sum_s != sum)
) {
bb_simple_error_msg_and_die("invalid tar header checksum");
}
/* GET_OCTAL trashes subsequent field, therefore we call it
* on fields in reverse order */
if (tar.devmajor[0]) {
char t = tar.prefix[0];
/* we trash prefix[0] here, but we DO need it later! */
unsigned minor = GET_OCTAL(tar.devminor);
unsigned major = GET_OCTAL(tar.devmajor);
file_header->device = makedev(major, minor);
tar.prefix[0] = t;
}
/* 0 is reserved for high perf file, treat as normal file */
if (tar_typeflag == '\0') tar_typeflag = '0';
parse_names = (tar_typeflag >= '0' && tar_typeflag <= '7');
file_header->link_target = NULL;
if (!p_linkname && parse_names && tar.linkname[0]) {
file_header->link_target = xstrndup(tar.linkname, sizeof(tar.linkname));
/* FIXME: what if we have non-link object with link_target? */
/* Will link_target be free()ed? */
}
#if ENABLE_FEATURE_TAR_UNAME_GNAME
file_header->tar__uname = tar.uname[0] ? xstrndup(tar.uname, sizeof(tar.uname)) : NULL;
file_header->tar__gname = tar.gname[0] ? xstrndup(tar.gname, sizeof(tar.gname)) : NULL;
#endif
file_header->mtime = GET_OCTAL(tar.mtime);
file_header->size = GET_OCTAL(tar.size);
file_header->gid = GET_OCTAL(tar.gid);
file_header->uid = GET_OCTAL(tar.uid);
/* Set bits 0-11 of the file's mode */
file_header->mode = 07777 & GET_OCTAL(tar.mode);
file_header->name = NULL;
if (!p_longname && parse_names) {
/* we trash mode[0] here, it's ok */
//tar.name[sizeof(tar.name)] = '\0'; - gcc 4.3.0 would complain
tar.mode[0] = '\0';
if (tar.prefix[0]) {
/* and padding[0] */
//tar.prefix[sizeof(tar.prefix)] = '\0'; - gcc 4.3.0 would complain
tar.padding[0] = '\0';
file_header->name = concat_path_file(tar.prefix, tar.name);
} else
file_header->name = xstrdup(tar.name);
}
switch (tar_typeflag) {
case '1': /* hardlink */
/* we mark hardlinks as regular files with zero size and a link name */
file_header->mode |= S_IFREG;
/* on size of link fields from star(4)
* ... For tar archives written by pre POSIX.1-1988
* implementations, the size field usually contains the size of
* the file and needs to be ignored as no data may follow this
* header type. For POSIX.1-1988 compliant archives, the size
* field needs to be 0. For POSIX.1-2001 compliant archives,
* the size field may be non zero, indicating that file data is
* included in the archive.
* i.e., always assume this is zero for safety.
*/
goto size0;
case '7':
/* case 0: */
case '0':
#if ENABLE_FEATURE_TAR_OLDGNU_COMPATIBILITY
if (file_header->name && last_char_is(file_header->name, '/')) {
goto set_dir;
}
#endif
file_header->mode |= S_IFREG;
break;
case '2':
file_header->mode |= S_IFLNK;
/* have seen tarballs with size field containing
* the size of the link target's name */
size0:
file_header->size = 0;
break;
case '3':
file_header->mode |= S_IFCHR;
goto size0; /* paranoia */
case '4':
file_header->mode |= S_IFBLK;
goto size0;
case '5':
IF_FEATURE_TAR_OLDGNU_COMPATIBILITY(set_dir:)
file_header->mode |= S_IFDIR;
goto size0;
case '6':
file_header->mode |= S_IFIFO;
goto size0;
case 'g': /* pax global header */
case 'x': { /* pax extended header */
if ((uoff_t)file_header->size > 0xfffff) /* paranoia */
goto skip_ext_hdr;
process_pax_hdr(archive_handle, file_header->size, (tar_typeflag == 'g'));
goto again_after_align;
#if ENABLE_FEATURE_TAR_GNU_EXTENSIONS
/* See http://www.gnu.org/software/tar/manual/html_node/Extensions.html */
case 'L':
/* free: paranoia: tar with several consecutive longnames */
free(p_longname);
/* For paranoia reasons we allocate extra NUL char */
die_if_bad_fnamesize(file_header->size);
p_longname = xzalloc(file_header->size + 1);
/* We read ASCIZ string, including NUL */
xread(archive_handle->src_fd, p_longname, file_header->size);
archive_handle->offset += file_header->size;
/* return get_header_tar(archive_handle); */
/* gcc 4.1.1 didn't optimize it into jump */
/* so we will do it ourselves, this also saves stack */
goto again;
case 'K':
free(p_linkname);
die_if_bad_fnamesize(file_header->size);
p_linkname = xzalloc(file_header->size + 1);
xread(archive_handle->src_fd, p_linkname, file_header->size);
archive_handle->offset += file_header->size;
/* return get_header_tar(archive_handle); */
goto again;
/*
* case 'S': // Sparse file
* Was seen in the wild. Not supported (yet?).
* See https://www.gnu.org/software/tar/manual/html_section/tar_92.html
* for the format. (An "Old GNU Format" was seen, not PAX formats).
*/
// case 'D': /* GNU dump dir */
// case 'M': /* Continuation of multi volume archive */
// case 'N': /* Old GNU for names > 100 characters */
case 'V': /* Volume header */
; /* Fall through to skip it */
#endif
}
skip_ext_hdr:
{
off_t sz;
bb_error_msg("warning: skipping header '%c'", tar_typeflag);
sz = (file_header->size + 511) & ~(off_t)511;
archive_handle->offset += sz;
sz >>= 9; /* sz /= 512 but w/o contortions for signed div */
while (sz--)
xread(archive_handle->src_fd, &tar, 512);
/* return get_header_tar(archive_handle); */
goto again_after_align;
}
default:
bb_error_msg_and_die("unknown typeflag: 0x%x", tar_typeflag);
}
#if ENABLE_FEATURE_TAR_GNU_EXTENSIONS
if (p_longname) {
file_header->name = p_longname;
p_longname = NULL;
}
if (p_linkname) {
file_header->link_target = p_linkname;
p_linkname = NULL;
}
#endif
/* Everything up to and including last ".." component is stripped */
overlapping_strcpy(file_header->name, strip_unsafe_prefix(file_header->name));
//TODO: do the same for file_header->link_target?
/* Strip trailing '/' in directories */
/* Must be done after mode is set as '/' is used to check if it's a directory */
cp = last_char_is(file_header->name, '/');
if (archive_handle->filter(archive_handle) == EXIT_SUCCESS) {
archive_handle->action_header(/*archive_handle->*/ file_header);
/* Note that we kill the '/' only after action_header() */
/* (like GNU tar 1.15.1: verbose mode outputs "dir/dir/") */
if (cp)
*cp = '\0';
archive_handle->action_data(archive_handle);
if (archive_handle->accept || archive_handle->reject
|| (archive_handle->ah_flags & ARCHIVE_REMEMBER_NAMES)
) {
llist_add_to(&archive_handle->passed, file_header->name);
} else /* Caller isn't interested in list of unpacked files */
free(file_header->name);
} else {
data_skip(archive_handle);
free(file_header->name);
}
archive_handle->offset += file_header->size;
free(file_header->link_target);
/* Do not free(file_header->name)!
* It might be inserted in archive_handle->passed - see above */
#if ENABLE_FEATURE_TAR_UNAME_GNAME
free(file_header->tar__uname);
free(file_header->tar__gname);
#endif
return EXIT_SUCCESS; /* "decoded one header" */
}
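
Reviewer note: the unsigned header checksum computed in get_header_tar() amounts to summing all 512 header bytes while counting the 8-byte chksum field (offsets 148 to 155) as spaces. A minimal sketch, with `tar_checksum` and the enum names being hypothetical and the signed (old Sun) variant omitted:

```c
#include <assert.h>
#include <stddef.h>

enum { TAR_BLOCK = 512, CHKSUM_OFF = 148, CHKSUM_LEN = 8 };

/* POSIX-style checksum: sum of the header bytes taken as unsigned values,
 * with the chksum field itself treated as if it were filled with spaces. */
static unsigned tar_checksum(const unsigned char *hdr)
{
	unsigned sum = 0;
	size_t i;

	assert(hdr != NULL);
	for (i = 0; i < TAR_BLOCK; i++) {
		if (i >= CHKSUM_OFF && i < CHKSUM_OFF + CHKSUM_LEN)
			sum += ' ';
		else
			sum += hdr[i];
	}
	return sum;
}
```

An all-zero block therefore sums to 8 * 32 = 256, and the stored contents of the chksum field never influence the result.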


@ -0,0 +1,20 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
char FAST_FUNC get_header_tar_bz2(archive_handle_t *archive_handle)
{
/* Can't lseek over pipes */
archive_handle->seek = seek_by_read;
fork_transformer_with_sig(archive_handle->src_fd, unpack_bz2_stream, "bunzip2");
archive_handle->offset = 0;
while (get_header_tar(archive_handle) == EXIT_SUCCESS)
continue;
/* Can only do one file at a time */
return EXIT_FAILURE;
}
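
Reviewer note: the wrappers set `seek = seek_by_read` because a pipe from a decompressor cannot be repositioned with lseek(). The idea of skipping forward by reading and discarding can be sketched as follows. `skip_by_read` is a hypothetical helper written for illustration, not BusyBox's seek_by_read(); it returns an error code where BusyBox helpers would typically die on failure.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Advance `fd` by `amount` bytes by reading into a scratch buffer.
 * Returns 0 on success, -1 on read error or premature EOF. */
static int skip_by_read(int fd, off_t amount)
{
	char buf[4096];

	assert(amount >= 0);
	while (amount > 0) {
		size_t want = amount < (off_t)sizeof(buf) ? (size_t)amount : sizeof(buf);
		ssize_t got = read(fd, buf, want);

		if (got < 0) {
			if (errno == EINTR)
				continue;       /* interrupted, retry */
			return -1;
		}
		if (got == 0)
			return -1;              /* EOF before the skip finished */
		amount -= got;
	}
	return 0;
}
```

On a pipe holding "abcdef", skipping 4 bytes leaves 'e' as the next byte to be read.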


@ -0,0 +1,20 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
char FAST_FUNC get_header_tar_gz(archive_handle_t *archive_handle)
{
/* Can't lseek over pipes */
archive_handle->seek = seek_by_read;
fork_transformer_with_sig(archive_handle->src_fd, unpack_gz_stream, "gunzip");
archive_handle->offset = 0;
while (get_header_tar(archive_handle) == EXIT_SUCCESS)
continue;
/* Can only do one file at a time */
return EXIT_FAILURE;
}


@ -0,0 +1,23 @@
/* vi: set sw=4 ts=4: */
/*
* Small lzma deflate implementation.
* Copyright (C) 2006 Aurelien Jacobs <aurel@gnuage.org>
*
* Licensed under GPLv2, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
char FAST_FUNC get_header_tar_lzma(archive_handle_t *archive_handle)
{
/* Can't lseek over pipes */
archive_handle->seek = seek_by_read;
fork_transformer_with_sig(archive_handle->src_fd, unpack_lzma_stream, "unlzma");
archive_handle->offset = 0;
while (get_header_tar(archive_handle) == EXIT_SUCCESS)
continue;
/* Can only do one file at a time */
return EXIT_FAILURE;
}


@ -0,0 +1,20 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
char FAST_FUNC get_header_tar_xz(archive_handle_t *archive_handle)
{
/* Can't lseek over pipes */
archive_handle->seek = seek_by_read;
fork_transformer_with_sig(archive_handle->src_fd, unpack_xz_stream, "unxz");
archive_handle->offset = 0;
while (get_header_tar(archive_handle) == EXIT_SUCCESS)
continue;
/* Can only do one file at a time */
return EXIT_FAILURE;
}


@ -0,0 +1,12 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC header_list(const file_header_t *file_header)
{
//TODO: cpio -vp DIR should output "DIR/NAME", not just "NAME"
puts(file_header->name);
}


@ -0,0 +1,10 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC header_skip(const file_header_t *file_header UNUSED_PARAM)
{
}


@ -0,0 +1,69 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC header_verbose_list(const file_header_t *file_header)
{
struct tm tm_time;
struct tm *ptm = &tm_time; //localtime(&file_header->mtime);
char modestr[12];
#if ENABLE_FEATURE_TAR_UNAME_GNAME
char uid[sizeof(int)*3 + 2];
/*char gid[sizeof(int)*3 + 2];*/
char *user;
char *group;
localtime_r(&file_header->mtime, ptm);
user = file_header->tar__uname;
if (user == NULL) {
sprintf(uid, "%u", (unsigned)file_header->uid);
user = uid;
}
group = file_header->tar__gname;
if (group == NULL) {
/*sprintf(gid, "%u", (unsigned)file_header->gid);*/
group = utoa(file_header->gid);
}
printf("%s %s/%s %9"OFF_FMT"u %4u-%02u-%02u %02u:%02u:%02u %s",
bb_mode_string(modestr, file_header->mode),
user,
group,
file_header->size,
1900 + ptm->tm_year,
1 + ptm->tm_mon,
ptm->tm_mday,
ptm->tm_hour,
ptm->tm_min,
ptm->tm_sec,
file_header->name);
#else /* !FEATURE_TAR_UNAME_GNAME */
localtime_r(&file_header->mtime, ptm);
printf("%s %u/%u %9"OFF_FMT"u %4u-%02u-%02u %02u:%02u:%02u %s",
bb_mode_string(modestr, file_header->mode),
(unsigned)file_header->uid,
(unsigned)file_header->gid,
file_header->size,
1900 + ptm->tm_year,
1 + ptm->tm_mon,
ptm->tm_mday,
ptm->tm_hour,
ptm->tm_min,
ptm->tm_sec,
file_header->name);
#endif /* FEATURE_TAR_UNAME_GNAME */
/* NB: GNU tar shows "->" for symlinks and "link to" for hardlinks */
if (file_header->link_target) {
printf(" -> %s", file_header->link_target);
}
bb_putchar('\n');
}


@ -0,0 +1,25 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
archive_handle_t* FAST_FUNC init_handle(void)
{
archive_handle_t *archive_handle;
/* Initialize default values */
archive_handle = xzalloc(sizeof(archive_handle_t));
archive_handle->file_header = xzalloc(sizeof(file_header_t));
archive_handle->action_header = header_skip;
archive_handle->action_data = data_skip;
archive_handle->filter = filter_accept_all;
archive_handle->seek = seek_by_jump;
#if ENABLE_CPIO || ENABLE_RPM2CPIO || ENABLE_RPM
archive_handle->cpio__owner.uid = (uid_t)-1L;
archive_handle->cpio__owner.gid = (gid_t)-1L;
#endif
return archive_handle;
}


@ -0,0 +1,95 @@
/*
This file is part of the LZO real-time data compression library.
Copyright (C) 1996..2008 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
Markus F.X.J. Oberhumer <markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include "liblzo_interface.h"
/* lzo-2.03/src/config1x.h */
#define M2_MIN_LEN 3
#define M2_MAX_LEN 8
#define M3_MAX_LEN 33
#define M4_MAX_LEN 9
#define M1_MAX_OFFSET 0x0400
#define M2_MAX_OFFSET 0x0800
#define M3_MAX_OFFSET 0x4000
#define M4_MAX_OFFSET 0xbfff
#define M1_MARKER 0
#define M3_MARKER 32
#define M4_MARKER 16
#define MX_MAX_OFFSET (M1_MAX_OFFSET + M2_MAX_OFFSET)
#define MIN_LOOKAHEAD (M2_MAX_LEN + 1)
#define LZO_EOF_CODE
/* lzo-2.03/src/lzo_dict.h */
#define GINDEX(m_pos,m_off,dict,dindex,in) m_pos = dict[dindex]
#define DX2(p,s1,s2) \
(((((unsigned)((p)[2]) << (s2)) ^ (p)[1]) << (s1)) ^ (p)[0])
//#define DA3(p,s1,s2,s3) ((DA2((p)+1,s2,s3) << (s1)) + (p)[0])
//#define DS3(p,s1,s2,s3) ((DS2((p)+1,s2,s3) << (s1)) - (p)[0])
#define DX3(p,s1,s2,s3) ((DX2((p)+1,s2,s3) << (s1)) ^ (p)[0])
#define D_SIZE (1U << D_BITS)
#define D_MASK ((1U << D_BITS) - 1)
#define D_HIGH ((D_MASK >> 1) + 1)
#define LZO_CHECK_MPOS_NON_DET(m_pos,m_off,in,ip,max_offset) \
( \
m_pos = ip - (unsigned)(ip - m_pos), \
((uintptr_t)m_pos < (uintptr_t)in \
|| (m_off = (unsigned)(ip - m_pos)) <= 0 \
|| m_off > max_offset) \
)
#define DENTRY(p,in) (p)
#define UPDATE_I(dict,drun,index,p,in) dict[index] = DENTRY(p,in)
#define DMS(v,s) ((unsigned) (((v) & (D_MASK >> (s))) << (s)))
#define DM(v) ((unsigned) ((v) & D_MASK))
#define DMUL(a,b) ((unsigned) ((a) * (b)))
/* lzo-2.03/src/lzo_ptr.h */
#define pd(a,b) ((unsigned)((a)-(b)))
# define TEST_IP (ip < ip_end)
# define NEED_IP(x) \
if ((unsigned)(ip_end - ip) < (unsigned)(x)) goto input_overrun
# define TEST_IV(x) if ((x) > (unsigned)0 - (511)) goto input_overrun
# undef TEST_OP /* don't need both of the tests here */
# define TEST_OP 1
# define NEED_OP(x) \
if ((unsigned)(op_end - op) < (unsigned)(x)) goto output_overrun
# define TEST_OV(x) if ((x) > (unsigned)0 - (511)) goto output_overrun
#define HAVE_ANY_OP 1
//#if defined(LZO_TEST_OVERRUN_LOOKBEHIND)
# define TEST_LB(m_pos) if (m_pos < out || m_pos >= op) goto lookbehind_overrun
//# define TEST_LBO(m_pos,o) if (m_pos < out || m_pos >= op - (o)) goto lookbehind_overrun
//#else
//# define TEST_LB(m_pos) ((void) 0)
//# define TEST_LBO(m_pos,o) ((void) 0)
//#endif
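
Reviewer note: NEED_IP and NEED_OP compare the number of bytes remaining against the requested count instead of computing `ip + x`, which could point past the buffer. The same idiom, written as a function with return codes in place of `goto` labels, might look like this. `copy_literals` and the `COPY_*` codes are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { COPY_OK = 0, COPY_INPUT_OVERRUN = -1, COPY_OUTPUT_OVERRUN = -2 };

/* Copy n bytes from *ip to *op, advancing both cursors, but only if both
 * buffers have at least n bytes left. Nothing is modified on failure. */
static int copy_literals(const unsigned char **ip, const unsigned char *ip_end,
		unsigned char **op, unsigned char *op_end, size_t n)
{
	assert(*ip <= ip_end && *op <= op_end);

	if ((size_t)(ip_end - *ip) < n)         /* like NEED_IP(n) */
		return COPY_INPUT_OVERRUN;
	if ((size_t)(op_end - *op) < n)         /* like NEED_OP(n) */
		return COPY_OUTPUT_OVERRUN;
	memcpy(*op, *ip, n);
	*ip += n;
	*op += n;
	return COPY_OK;
}
```

Because the subtraction is done on pointers inside the same buffer, the check itself cannot overflow, whatever value of n a corrupted stream supplies.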


@ -0,0 +1,35 @@
/* LZO1X-1 compression
This file is part of the LZO real-time data compression library.
Copyright (C) 1996..2008 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
Markus F.X.J. Oberhumer <markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include "libbb.h"
#include "liblzo.h"
#define D_BITS 14
#define D_INDEX1(d,p) d = DM(DMUL(0x21,DX3(p,5,5,6)) >> 5)
#define D_INDEX2(d,p) d = (d & (D_MASK & 0x7ff)) ^ (D_HIGH | 0x1f)
#define DO_COMPRESS lzo1x_1_compress
#include "lzo1x_c.c"


@ -0,0 +1,35 @@
/* LZO1X-1(15) compression
This file is part of the LZO real-time data compression library.
Copyright (C) 1996..2008 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
Markus F.X.J. Oberhumer <markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include "libbb.h"
#include "liblzo.h"
#define D_BITS 15
#define D_INDEX1(d,p) d = DM(DMUL(0x21,DX3(p,5,5,6)) >> 5)
#define D_INDEX2(d,p) d = (d & (D_MASK & 0x7ff)) ^ (D_HIGH | 0x1f)
#define DO_COMPRESS lzo1x_1_15_compress
#include "lzo1x_c.c"


@ -0,0 +1,919 @@
/* lzo1x_9x.c -- implementation of the LZO1X-999 compression algorithm
This file is part of the LZO real-time data compression library.
Copyright (C) 2008 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2007 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2006 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2005 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2004 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2003 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2002 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2001 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 2000 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 1999 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 1998 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 1997 Markus Franz Xaver Johannes Oberhumer
Copyright (C) 1996 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
Markus F.X.J. Oberhumer
<markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
*/
#include "libbb.h"
/* The following is probably only safe on Intel-compatible processors ... */
#define LZO_UNALIGNED_OK_2
#define LZO_UNALIGNED_OK_4
#include "liblzo.h"
#define LZO_MAX(a,b) ((a) >= (b) ? (a) : (b))
#define LZO_MIN(a,b) ((a) <= (b) ? (a) : (b))
#define LZO_MAX3(a,b,c) ((a) >= (b) ? LZO_MAX(a,c) : LZO_MAX(b,c))
/***********************************************************************
//
************************************************************************/
#define SWD_N M4_MAX_OFFSET /* size of ring buffer */
#define SWD_F 2048 /* upper limit for match length */
#define SWD_BEST_OFF (LZO_MAX3(M2_MAX_LEN, M3_MAX_LEN, M4_MAX_LEN) + 1)
typedef struct {
int init;
unsigned look; /* bytes in lookahead buffer */
unsigned m_len;
unsigned m_off;
const uint8_t *bp;
const uint8_t *ip;
const uint8_t *in;
const uint8_t *in_end;
uint8_t *out;
unsigned r1_lit;
} lzo1x_999_t;
#define getbyte(c) ((c).ip < (c).in_end ? *((c).ip)++ : (-1))
/* lzo_swd.c -- sliding window dictionary */
/***********************************************************************
//
************************************************************************/
#define SWD_UINT_MAX USHRT_MAX
#ifndef SWD_HSIZE
# define SWD_HSIZE 16384
#endif
#ifndef SWD_MAX_CHAIN
# define SWD_MAX_CHAIN 2048
#endif
#define HEAD3(b, p) \
( ((0x9f5f * ((((b[p]<<5)^b[p+1])<<5) ^ b[p+2])) >> 5) & (SWD_HSIZE-1) )
#if defined(LZO_UNALIGNED_OK_2)
# define HEAD2(b,p) (* (bb__aliased_uint16_t *) &(b[p]))
#else
# define HEAD2(b,p) (b[p] ^ ((unsigned)b[p+1]<<8))
#endif
#define NIL2 SWD_UINT_MAX
typedef struct lzo_swd {
/* public - "built-in" */
/* public - configuration */
unsigned max_chain;
int use_best_off;
/* public - output */
unsigned m_len;
unsigned m_off;
unsigned look;
int b_char;
#if defined(SWD_BEST_OFF)
unsigned best_off[SWD_BEST_OFF];
#endif
/* semi public */
lzo1x_999_t *c;
unsigned m_pos;
#if defined(SWD_BEST_OFF)
unsigned best_pos[SWD_BEST_OFF];
#endif
/* private */
unsigned ip; /* input pointer (lookahead) */
unsigned bp; /* buffer pointer */
unsigned rp; /* remove pointer */
unsigned node_count;
unsigned first_rp;
uint8_t b[SWD_N + SWD_F];
uint8_t b_wrap[SWD_F]; /* must follow b */
uint16_t head3[SWD_HSIZE];
uint16_t succ3[SWD_N + SWD_F];
uint16_t best3[SWD_N + SWD_F];
uint16_t llen3[SWD_HSIZE];
#ifdef HEAD2
uint16_t head2[65536L];
#endif
} lzo_swd_t, *lzo_swd_p;
#define SIZEOF_LZO_SWD_T (sizeof(lzo_swd_t))
/* Access macro for head3.
* head3[key] may be uninitialized, but then its value will never be used.
*/
#define s_get_head3(s,key) s->head3[key]
/***********************************************************************
//
************************************************************************/
#define B_SIZE (SWD_N + SWD_F)
static int swd_init(lzo_swd_p s)
{
/* defaults */
s->node_count = SWD_N;
memset(s->llen3, 0, sizeof(s->llen3[0]) * (unsigned)SWD_HSIZE);
#ifdef HEAD2
memset(s->head2, 0xff, sizeof(s->head2[0]) * 65536L);
assert(s->head2[0] == NIL2);
#endif
s->ip = 0;
s->bp = s->ip;
s->first_rp = s->ip;
assert(s->ip + SWD_F <= B_SIZE);
s->look = (unsigned) (s->c->in_end - s->c->ip);
if (s->look > 0) {
if (s->look > SWD_F)
s->look = SWD_F;
memcpy(&s->b[s->ip], s->c->ip, s->look);
s->c->ip += s->look;
s->ip += s->look;
}
if (s->ip == B_SIZE)
s->ip = 0;
s->rp = s->first_rp;
if (s->rp >= s->node_count)
s->rp -= s->node_count;
else
s->rp += B_SIZE - s->node_count;
return LZO_E_OK;
}
#define swd_pos2off(s,pos) \
(s->bp > (pos) ? s->bp - (pos) : B_SIZE - ((pos) - s->bp))
/***********************************************************************
// read the next input byte into the ring buffer and advance ip/bp/rp
************************************************************************/
static void swd_getbyte(lzo_swd_p s)
{
int c;
if ((c = getbyte(*(s->c))) < 0) {
if (s->look > 0)
--s->look;
} else {
s->b[s->ip] = c;
if (s->ip < SWD_F)
s->b_wrap[s->ip] = c;
}
if (++s->ip == B_SIZE)
s->ip = 0;
if (++s->bp == B_SIZE)
s->bp = 0;
if (++s->rp == B_SIZE)
s->rp = 0;
}
/***********************************************************************
// remove node from lists
************************************************************************/
static void swd_remove_node(lzo_swd_p s, unsigned node)
{
if (s->node_count == 0) {
unsigned key;
key = HEAD3(s->b,node);
assert(s->llen3[key] > 0);
--s->llen3[key];
#ifdef HEAD2
key = HEAD2(s->b,node);
assert(s->head2[key] != NIL2);
if ((unsigned) s->head2[key] == node)
s->head2[key] = NIL2;
#endif
} else
--s->node_count;
}
/***********************************************************************
// accept n bytes: index each position without searching for a match
************************************************************************/
static void swd_accept(lzo_swd_p s, unsigned n)
{
assert(n <= s->look);
while (n--) {
unsigned key;
swd_remove_node(s,s->rp);
/* add bp into HEAD3 */
key = HEAD3(s->b, s->bp);
s->succ3[s->bp] = s_get_head3(s, key);
s->head3[key] = s->bp;
s->best3[s->bp] = SWD_F + 1;
s->llen3[key]++;
assert(s->llen3[key] <= SWD_N);
#ifdef HEAD2
/* add bp into HEAD2 */
key = HEAD2(s->b, s->bp);
s->head2[key] = s->bp;
#endif
swd_getbyte(s);
}
}
/***********************************************************************
// walk a HEAD3 hash chain looking for the longest match
************************************************************************/
static void swd_search(lzo_swd_p s, unsigned node, unsigned cnt)
{
const uint8_t *p1;
const uint8_t *p2;
const uint8_t *px;
unsigned m_len = s->m_len;
const uint8_t *b = s->b;
const uint8_t *bp = s->b + s->bp;
const uint8_t *bx = s->b + s->bp + s->look;
unsigned char scan_end1;
assert(s->m_len > 0);
scan_end1 = bp[m_len - 1];
for ( ; cnt-- > 0; node = s->succ3[node]) {
p1 = bp;
p2 = b + node;
px = bx;
assert(m_len < s->look);
if (p2[m_len - 1] == scan_end1
&& p2[m_len] == p1[m_len]
&& p2[0] == p1[0]
&& p2[1] == p1[1]
) {
unsigned i;
assert(lzo_memcmp(bp, &b[node], 3) == 0);
p1 += 2; p2 += 2;
do {} while (++p1 < px && *p1 == *++p2);
i = p1-bp;
assert(lzo_memcmp(bp, &b[node], i) == 0);
#if defined(SWD_BEST_OFF)
if (i < SWD_BEST_OFF) {
if (s->best_pos[i] == 0)
s->best_pos[i] = node + 1;
}
#endif
if (i > m_len) {
s->m_len = m_len = i;
s->m_pos = node;
if (m_len == s->look)
return;
if (m_len >= SWD_F)
return;
if (m_len > (unsigned) s->best3[node])
return;
scan_end1 = bp[m_len - 1];
}
}
}
}
/***********************************************************************
// check the HEAD2 table for a 2-byte match
************************************************************************/
#ifdef HEAD2
static int swd_search2(lzo_swd_p s)
{
unsigned key;
assert(s->look >= 2);
assert(s->m_len > 0);
key = s->head2[HEAD2(s->b, s->bp)];
if (key == NIL2)
return 0;
assert(lzo_memcmp(&s->b[s->bp], &s->b[key], 2) == 0);
#if defined(SWD_BEST_OFF)
if (s->best_pos[2] == 0)
s->best_pos[2] = key + 1;
#endif
if (s->m_len < 2) {
s->m_len = 2;
s->m_pos = key;
}
return 1;
}
#endif
/***********************************************************************
// find the best match at the current buffer position
************************************************************************/
static void swd_findbest(lzo_swd_p s)
{
unsigned key;
unsigned cnt, node;
unsigned len;
assert(s->m_len > 0);
/* get current head, add bp into HEAD3 */
key = HEAD3(s->b,s->bp);
node = s->succ3[s->bp] = s_get_head3(s, key);
cnt = s->llen3[key]++;
assert(s->llen3[key] <= SWD_N + SWD_F);
if (cnt > s->max_chain)
cnt = s->max_chain;
s->head3[key] = s->bp;
s->b_char = s->b[s->bp];
len = s->m_len;
if (s->m_len >= s->look) {
if (s->look == 0)
s->b_char = -1;
s->m_off = 0;
s->best3[s->bp] = SWD_F + 1;
} else {
#ifdef HEAD2
if (swd_search2(s))
#endif
if (s->look >= 3)
swd_search(s, node, cnt);
if (s->m_len > len)
s->m_off = swd_pos2off(s,s->m_pos);
s->best3[s->bp] = s->m_len;
#if defined(SWD_BEST_OFF)
if (s->use_best_off) {
int i;
for (i = 2; i < SWD_BEST_OFF; i++) {
if (s->best_pos[i] > 0)
s->best_off[i] = swd_pos2off(s, s->best_pos[i]-1);
else
s->best_off[i] = 0;
}
}
#endif
}
swd_remove_node(s,s->rp);
#ifdef HEAD2
/* add bp into HEAD2 */
key = HEAD2(s->b, s->bp);
s->head2[key] = s->bp;
#endif
}
#undef HEAD3
#undef HEAD2
#undef s_get_head3
/***********************************************************************
// bind the compressor state to the dictionary and initialize it
************************************************************************/
static int init_match(lzo1x_999_t *c, lzo_swd_p s, uint32_t use_best_off)
{
int r;
assert(!c->init);
c->init = 1;
s->c = c;
r = swd_init(s);
if (r != 0)
return r;
s->use_best_off = use_best_off;
return r;
}
/***********************************************************************
// skip past the previous match or literal and find the next best match
************************************************************************/
static int find_match(lzo1x_999_t *c, lzo_swd_p s,
unsigned this_len, unsigned skip)
{
assert(c->init);
if (skip > 0) {
assert(this_len >= skip);
swd_accept(s, this_len - skip);
} else {
assert(this_len <= 1);
}
s->m_len = 1;
#ifdef SWD_BEST_OFF
if (s->use_best_off)
memset(s->best_pos, 0, sizeof(s->best_pos));
#endif
swd_findbest(s);
c->m_len = s->m_len;
c->m_off = s->m_off;
swd_getbyte(s);
if (s->b_char < 0) {
c->look = 0;
c->m_len = 0;
} else {
c->look = s->look + 1;
}
c->bp = c->ip - c->look;
return LZO_E_OK;
}
/* this is a public function, but there is no prototype in a header file */
static int lzo1x_999_compress_internal(const uint8_t *in, unsigned in_len,
uint8_t *out, unsigned *out_len,
void *wrkmem,
unsigned good_length,
unsigned max_lazy,
unsigned max_chain,
uint32_t use_best_off);
/***********************************************************************
// emit matches and literal runs into the output stream
************************************************************************/
static uint8_t *code_match(lzo1x_999_t *c,
uint8_t *op, unsigned m_len, unsigned m_off)
{
assert(op > c->out);
if (m_len == 2) {
assert(m_off <= M1_MAX_OFFSET);
assert(c->r1_lit > 0);
assert(c->r1_lit < 4);
m_off -= 1;
*op++ = M1_MARKER | ((m_off & 3) << 2);
*op++ = m_off >> 2;
} else if (m_len <= M2_MAX_LEN && m_off <= M2_MAX_OFFSET) {
assert(m_len >= 3);
m_off -= 1;
*op++ = ((m_len - 1) << 5) | ((m_off & 7) << 2);
*op++ = m_off >> 3;
assert(op[-2] >= M2_MARKER);
} else if (m_len == M2_MIN_LEN && m_off <= MX_MAX_OFFSET && c->r1_lit >= 4) {
assert(m_len == 3);
assert(m_off > M2_MAX_OFFSET);
m_off -= 1 + M2_MAX_OFFSET;
*op++ = M1_MARKER | ((m_off & 3) << 2);
*op++ = m_off >> 2;
} else if (m_off <= M3_MAX_OFFSET) {
assert(m_len >= 3);
m_off -= 1;
if (m_len <= M3_MAX_LEN)
*op++ = M3_MARKER | (m_len - 2);
else {
m_len -= M3_MAX_LEN;
*op++ = M3_MARKER | 0;
while (m_len > 255) {
m_len -= 255;
*op++ = 0;
}
assert(m_len > 0);
*op++ = m_len;
}
*op++ = m_off << 2;
*op++ = m_off >> 6;
} else {
unsigned k;
assert(m_len >= 3);
assert(m_off > 0x4000);
assert(m_off <= 0xbfff);
m_off -= 0x4000;
k = (m_off & 0x4000) >> 11;
if (m_len <= M4_MAX_LEN)
*op++ = M4_MARKER | k | (m_len - 2);
else {
m_len -= M4_MAX_LEN;
*op++ = M4_MARKER | k | 0;
while (m_len > 255) {
m_len -= 255;
*op++ = 0;
}
assert(m_len > 0);
*op++ = m_len;
}
*op++ = m_off << 2;
*op++ = m_off >> 6;
}
return op;
}
static uint8_t *STORE_RUN(lzo1x_999_t *c, uint8_t *op,
const uint8_t *ii, unsigned t)
{
if (op == c->out && t <= 238) {
*op++ = 17 + t;
} else if (t <= 3) {
op[-2] |= t;
} else if (t <= 18) {
*op++ = t - 3;
} else {
unsigned tt = t - 18;
*op++ = 0;
while (tt > 255) {
tt -= 255;
*op++ = 0;
}
assert(tt > 0);
*op++ = tt;
}
do *op++ = *ii++; while (--t > 0);
return op;
}
static uint8_t *code_run(lzo1x_999_t *c, uint8_t *op, const uint8_t *ii,
unsigned lit)
{
if (lit > 0) {
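/* the match coded next has length >= 2 (m_len is not in scope here) */
op = STORE_RUN(c, op, ii, lit);
} else {
/* the match coded next has length >= 3 (m_len is not in scope here) */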
}
c->r1_lit = lit;
return op;
}
/***********************************************************************
// estimate encoded sizes for lazy-match decisions
************************************************************************/
static int len_of_coded_match(unsigned m_len, unsigned m_off, unsigned lit)
{
int n = 4;
if (m_len < 2)
return -1;
if (m_len == 2)
return (m_off <= M1_MAX_OFFSET && lit > 0 && lit < 4) ? 2 : -1;
if (m_len <= M2_MAX_LEN && m_off <= M2_MAX_OFFSET)
return 2;
if (m_len == M2_MIN_LEN && m_off <= MX_MAX_OFFSET && lit >= 4)
return 2;
if (m_off <= M3_MAX_OFFSET) {
if (m_len <= M3_MAX_LEN)
return 3;
m_len -= M3_MAX_LEN;
} else if (m_off <= M4_MAX_OFFSET) {
if (m_len <= M4_MAX_LEN)
return 3;
m_len -= M4_MAX_LEN;
} else
return -1;
while (m_len > 255) {
m_len -= 255;
n++;
}
return n;
}
static int min_gain(unsigned ahead, unsigned lit1,
unsigned lit2, int l1, int l2, int l3)
{
int lazy_match_min_gain = 0;
assert (ahead >= 1);
lazy_match_min_gain += ahead;
if (lit1 <= 3)
lazy_match_min_gain += (lit2 <= 3) ? 0 : 2;
else if (lit1 <= 18)
lazy_match_min_gain += (lit2 <= 18) ? 0 : 1;
lazy_match_min_gain += (l2 - l1) * 2;
if (l3 > 0)
lazy_match_min_gain -= (ahead - l3) * 2;
if (lazy_match_min_gain < 0)
lazy_match_min_gain = 0;
return lazy_match_min_gain;
}
/***********************************************************************
// try a slightly shorter match that fits a cheaper encoding
************************************************************************/
#if defined(SWD_BEST_OFF)
static void better_match(const lzo_swd_p swd,
unsigned *m_len, unsigned *m_off)
{
if (*m_len <= M2_MIN_LEN)
return;
if (*m_off <= M2_MAX_OFFSET)
return;
/* M3/M4 -> M2 */
if (*m_off > M2_MAX_OFFSET
&& *m_len >= M2_MIN_LEN + 1 && *m_len <= M2_MAX_LEN + 1
&& swd->best_off[*m_len-1] && swd->best_off[*m_len-1] <= M2_MAX_OFFSET
) {
*m_len = *m_len - 1;
*m_off = swd->best_off[*m_len];
return;
}
/* M4 -> M2 */
if (*m_off > M3_MAX_OFFSET
&& *m_len >= M4_MAX_LEN + 1 && *m_len <= M2_MAX_LEN + 2
&& swd->best_off[*m_len-2] && swd->best_off[*m_len-2] <= M2_MAX_OFFSET
) {
*m_len = *m_len - 2;
*m_off = swd->best_off[*m_len];
return;
}
/* M4 -> M3 */
if (*m_off > M3_MAX_OFFSET
&& *m_len >= M4_MAX_LEN + 1 && *m_len <= M3_MAX_LEN + 1
&& swd->best_off[*m_len-1] && swd->best_off[*m_len-1] <= M3_MAX_OFFSET
) {
*m_len = *m_len - 1;
*m_off = swd->best_off[*m_len];
}
}
#endif
/***********************************************************************
// main LZO1X-999 compression loop
************************************************************************/
static int lzo1x_999_compress_internal(const uint8_t *in, unsigned in_len,
uint8_t *out, unsigned *out_len,
void *wrkmem,
unsigned good_length,
unsigned max_lazy,
unsigned max_chain,
uint32_t use_best_off)
{
uint8_t *op;
const uint8_t *ii;
unsigned lit;
unsigned m_len, m_off;
lzo1x_999_t cc;
lzo1x_999_t *const c = &cc;
const lzo_swd_p swd = (lzo_swd_p) wrkmem;
int r;
c->init = 0;
c->ip = c->in = in;
c->in_end = in + in_len;
c->out = out;
op = out;
ii = c->ip; /* point to start of literal run */
lit = 0;
c->r1_lit = 0;
r = init_match(c, swd, use_best_off);
if (r != 0)
return r;
swd->max_chain = max_chain;
r = find_match(c, swd, 0, 0);
if (r != 0)
return r;
while (c->look > 0) {
unsigned ahead;
unsigned max_ahead;
int l1, l2, l3;
m_len = c->m_len;
m_off = c->m_off;
assert(c->bp == c->ip - c->look);
assert(c->bp >= in);
if (lit == 0)
ii = c->bp;
assert(ii + lit == c->bp);
assert(swd->b_char == *(c->bp));
if (m_len < 2
|| (m_len == 2 && (m_off > M1_MAX_OFFSET || lit == 0 || lit >= 4))
/* Do not accept this match for compressed-data compatibility
* with LZO v1.01 and before
* [ might be a problem for decompress() and optimize() ]
*/
|| (m_len == 2 && op == out)
|| (op == out && lit == 0)
) {
/* a literal */
m_len = 0;
}
else if (m_len == M2_MIN_LEN) {
/* compression ratio improves if we code a literal in some cases */
if (m_off > MX_MAX_OFFSET && lit >= 4)
m_len = 0;
}
if (m_len == 0) {
/* a literal */
lit++;
swd->max_chain = max_chain;
r = find_match(c, swd, 1, 0);
assert(r == 0);
continue;
}
/* a match */
#if defined(SWD_BEST_OFF)
if (swd->use_best_off)
better_match(swd, &m_len, &m_off);
#endif
/* shall we try a lazy match ? */
ahead = 0;
if (m_len >= max_lazy) {
/* no */
l1 = 0;
max_ahead = 0;
} else {
/* yes, try a lazy match */
l1 = len_of_coded_match(m_len, m_off, lit);
assert(l1 > 0);
max_ahead = LZO_MIN(2, (unsigned)l1 - 1);
}
while (ahead < max_ahead && c->look > m_len) {
int lazy_match_min_gain;
if (m_len >= good_length)
swd->max_chain = max_chain >> 2;
else
swd->max_chain = max_chain;
r = find_match(c, swd, 1, 0);
ahead++;
assert(r == 0);
assert(c->look > 0);
assert(ii + lit + ahead == c->bp);
if (c->m_len < m_len)
continue;
if (c->m_len == m_len && c->m_off >= m_off)
continue;
#if defined(SWD_BEST_OFF)
if (swd->use_best_off)
better_match(swd, &c->m_len, &c->m_off);
#endif
l2 = len_of_coded_match(c->m_len, c->m_off, lit+ahead);
if (l2 < 0)
continue;
/* compressed-data compatibility [see above] */
l3 = (op == out) ? -1 : len_of_coded_match(ahead, m_off, lit);
lazy_match_min_gain = min_gain(ahead, lit, lit+ahead, l1, l2, l3);
if (c->m_len >= m_len + lazy_match_min_gain) {
if (l3 > 0) {
/* code previous run */
op = code_run(c, op, ii, lit);
lit = 0;
/* code shortened match */
op = code_match(c, op, ahead, m_off);
} else {
lit += ahead;
assert(ii + lit == c->bp);
}
goto lazy_match_done;
}
}
assert(ii + lit + ahead == c->bp);
/* 1 - code run */
op = code_run(c, op, ii, lit);
lit = 0;
/* 2 - code match */
op = code_match(c, op, m_len, m_off);
swd->max_chain = max_chain;
r = find_match(c, swd, m_len, 1+ahead);
assert(r == 0);
lazy_match_done: ;
}
/* store final run */
if (lit > 0)
op = STORE_RUN(c, op, ii, lit);
#if defined(LZO_EOF_CODE)
*op++ = M4_MARKER | 1;
*op++ = 0;
*op++ = 0;
#endif
*out_len = op - out;
return LZO_E_OK;
}
/***********************************************************************
// public entry point: map compression levels 7..9 to tuning parameters
************************************************************************/
int lzo1x_999_compress_level(const uint8_t *in, unsigned in_len,
uint8_t *out, unsigned *out_len,
void *wrkmem,
int compression_level)
{
static const struct {
uint16_t good_length;
uint16_t max_lazy;
uint16_t max_chain;
uint16_t use_best_off;
} c[3] = {
{ 8, 32, 256, 0 },
{ 32, 128, 2048, 1 },
{ SWD_F, SWD_F, 4096, 1 } /* max. compression */
};
if (compression_level < 7 || compression_level > 9)
return LZO_E_ERROR;
compression_level -= 7;
return lzo1x_999_compress_internal(in, in_len, out, out_len, wrkmem,
c[compression_level].good_length,
c[compression_level].max_lazy,
c[compression_level].max_chain,
c[compression_level].use_best_off);
}
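
The level table in lzo1x_999_compress_level() above can be read in isolation. The following is a minimal sketch, not the BusyBox code: the names `lzo_level_params`, `lzo_level_lookup`, `lzo_lazy_chain_limit` and `EXAMPLE_SWD_F` are hypothetical, and the value 2048 for SWD_F is an assumption taken from upstream LZO, since its definition is not shown in this hunk. The second helper mirrors the `m_len >= good_length` test that shortens the hash-chain search while probing for a lazy match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: SWD_F (the match-length limit) is 2048, as in upstream LZO. */
#define EXAMPLE_SWD_F 2048

struct lzo_level_params {
    uint16_t good_length;  /* matches at least this long shorten the lazy search */
    uint16_t max_lazy;     /* matches at least this long skip lazy matching */
    uint16_t max_chain;    /* hash-chain nodes examined per search */
    uint16_t use_best_off; /* nonzero: track the best offset per match length */
};

/* Hypothetical helper: parameters for levels 7..9, NULL for any other level. */
const struct lzo_level_params *lzo_level_lookup(int level)
{
    static const struct lzo_level_params table[3] = {
        { 8, 32, 256, 0 },
        { 32, 128, 2048, 1 },
        { EXAMPLE_SWD_F, EXAMPLE_SWD_F, 4096, 1 } /* max. compression */
    };
    if (level < 7 || level > 9)
        return NULL;
    return &table[level - 7];
}

/* Chain limit used while probing for a lazy match: a quarter of max_chain
 * once the current match is already "good enough". */
unsigned lzo_lazy_chain_limit(int level, unsigned m_len)
{
    const struct lzo_level_params *p = lzo_level_lookup(level);
    assert(p != NULL); /* caller must pass a level in 7..9 */
    if (m_len >= p->good_length)
        return (unsigned)p->max_chain >> 2;
    return p->max_chain;
}
```

Returning NULL for an out-of-range level plays the role of the LZO_E_ERROR return in the original; a caller would check the pointer before starting compression.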


@ -0,0 +1,296 @@
/* implementation of the LZO1[XY]-1 compression algorithm
This file is part of the LZO real-time data compression library.
Copyright (C) 1996..2008 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
Markus F.X.J. Oberhumer <markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
/***********************************************************************
// compress a block of data.
************************************************************************/
static NOINLINE unsigned
do_compress(const uint8_t* in, unsigned in_len,
uint8_t* out, unsigned* out_len,
void* wrkmem)
{
register const uint8_t* ip;
uint8_t* op;
const uint8_t* const in_end = in + in_len;
const uint8_t* const ip_end = in + in_len - M2_MAX_LEN - 5;
const uint8_t* ii;
const void* *const dict = (const void**) wrkmem;
op = out;
ip = in;
ii = ip;
ip += 4;
for (;;) {
register const uint8_t* m_pos;
unsigned m_off;
unsigned m_len;
unsigned dindex;
D_INDEX1(dindex,ip);
GINDEX(m_pos,m_off,dict,dindex,in);
if (LZO_CHECK_MPOS_NON_DET(m_pos,m_off,in,ip,M4_MAX_OFFSET))
goto literal;
#if 1
if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3])
goto try_match;
D_INDEX2(dindex,ip);
#endif
GINDEX(m_pos,m_off,dict,dindex,in);
if (LZO_CHECK_MPOS_NON_DET(m_pos,m_off,in,ip,M4_MAX_OFFSET))
goto literal;
if (m_off <= M2_MAX_OFFSET || m_pos[3] == ip[3])
goto try_match;
goto literal;
try_match:
#if 1 && defined(LZO_UNALIGNED_OK_2)
if (* (const lzo_ushortp) m_pos != * (const lzo_ushortp) ip)
#else
if (m_pos[0] != ip[0] || m_pos[1] != ip[1])
#endif
{
} else {
if (m_pos[2] == ip[2]) {
#if 0
if (m_off <= M2_MAX_OFFSET)
goto match;
if (lit <= 3)
goto match;
if (lit == 3) { /* better compression, but slower */
assert(op - 2 > out); op[-2] |= (uint8_t)(3);
*op++ = *ii++; *op++ = *ii++; *op++ = *ii++;
goto code_match;
}
if (m_pos[3] == ip[3])
#endif
goto match;
}
else {
/* still need a better way for finding M1 matches */
#if 0
/* an M1 match */
#if 0
if (m_off <= M1_MAX_OFFSET && lit > 0 && lit <= 3)
#else
if (m_off <= M1_MAX_OFFSET && lit == 3)
#endif
{
register unsigned t;
t = lit;
assert(op - 2 > out); op[-2] |= (uint8_t)(t);
do *op++ = *ii++; while (--t > 0);
assert(ii == ip);
m_off -= 1;
*op++ = (uint8_t)(M1_MARKER | ((m_off & 3) << 2));
*op++ = (uint8_t)(m_off >> 2);
ip += 2;
goto match_done;
}
#endif
}
}
/* a literal */
literal:
UPDATE_I(dict, 0, dindex, ip, in);
++ip;
if (ip >= ip_end)
break;
continue;
/* a match */
match:
UPDATE_I(dict, 0, dindex, ip, in);
/* store current literal run */
if (pd(ip, ii) > 0) {
register unsigned t = pd(ip, ii);
if (t <= 3) {
assert(op - 2 > out);
op[-2] |= (uint8_t)(t);
}
else if (t <= 18)
*op++ = (uint8_t)(t - 3);
else {
register unsigned tt = t - 18;
*op++ = 0;
while (tt > 255) {
tt -= 255;
*op++ = 0;
}
assert(tt > 0);
*op++ = (uint8_t)(tt);
}
do *op++ = *ii++; while (--t > 0);
}
/* code the match */
assert(ii == ip);
ip += 3;
if (m_pos[3] != *ip++ || m_pos[4] != *ip++ || m_pos[5] != *ip++
|| m_pos[6] != *ip++ || m_pos[7] != *ip++ || m_pos[8] != *ip++
#ifdef LZO1Y
|| m_pos[ 9] != *ip++ || m_pos[10] != *ip++ || m_pos[11] != *ip++
|| m_pos[12] != *ip++ || m_pos[13] != *ip++ || m_pos[14] != *ip++
#endif
) {
--ip;
m_len = pd(ip, ii);
assert(m_len >= 3);
assert(m_len <= M2_MAX_LEN);
if (m_off <= M2_MAX_OFFSET) {
m_off -= 1;
#if defined(LZO1X)
*op++ = (uint8_t)(((m_len - 1) << 5) | ((m_off & 7) << 2));
*op++ = (uint8_t)(m_off >> 3);
#elif defined(LZO1Y)
*op++ = (uint8_t)(((m_len + 1) << 4) | ((m_off & 3) << 2));
*op++ = (uint8_t)(m_off >> 2);
#endif
}
else if (m_off <= M3_MAX_OFFSET) {
m_off -= 1;
*op++ = (uint8_t)(M3_MARKER | (m_len - 2));
goto m3_m4_offset;
} else {
#if defined(LZO1X)
m_off -= 0x4000;
assert(m_off > 0);
assert(m_off <= 0x7fff);
*op++ = (uint8_t)(M4_MARKER | ((m_off & 0x4000) >> 11) | (m_len - 2));
goto m3_m4_offset;
#elif defined(LZO1Y)
goto m4_match;
#endif
}
}
else {
{
const uint8_t* end = in_end;
const uint8_t* m = m_pos + M2_MAX_LEN + 1;
while (ip < end && *m == *ip)
m++, ip++;
m_len = pd(ip, ii);
}
assert(m_len > M2_MAX_LEN);
if (m_off <= M3_MAX_OFFSET) {
m_off -= 1;
if (m_len <= 33)
*op++ = (uint8_t)(M3_MARKER | (m_len - 2));
else {
m_len -= 33;
*op++ = M3_MARKER | 0;
goto m3_m4_len;
}
} else {
#if defined(LZO1Y)
m4_match:
#endif
m_off -= 0x4000;
assert(m_off > 0);
assert(m_off <= 0x7fff);
if (m_len <= M4_MAX_LEN)
*op++ = (uint8_t)(M4_MARKER | ((m_off & 0x4000) >> 11) | (m_len - 2));
else {
m_len -= M4_MAX_LEN;
*op++ = (uint8_t)(M4_MARKER | ((m_off & 0x4000) >> 11));
m3_m4_len:
while (m_len > 255) {
m_len -= 255;
*op++ = 0;
}
assert(m_len > 0);
*op++ = (uint8_t)(m_len);
}
}
m3_m4_offset:
*op++ = (uint8_t)((m_off & 63) << 2);
*op++ = (uint8_t)(m_off >> 6);
}
#if 0
match_done:
#endif
ii = ip;
if (ip >= ip_end)
break;
}
*out_len = pd(op, out);
return pd(in_end, ii);
}
/***********************************************************************
// public entry point
************************************************************************/
int DO_COMPRESS(const uint8_t* in, unsigned in_len,
uint8_t* out, unsigned* out_len,
void* wrkmem)
{
uint8_t* op = out;
unsigned t;
if (in_len <= M2_MAX_LEN + 5)
t = in_len;
else {
t = do_compress(in,in_len,op,out_len,wrkmem);
op += *out_len;
}
if (t > 0) {
const uint8_t* ii = in + in_len - t;
if (op == out && t <= 238)
*op++ = (uint8_t)(17 + t);
else if (t <= 3)
op[-2] |= (uint8_t)(t);
else if (t <= 18)
*op++ = (uint8_t)(t - 3);
else {
unsigned tt = t - 18;
*op++ = 0;
while (tt > 255) {
tt -= 255;
*op++ = 0;
}
assert(tt > 0);
*op++ = (uint8_t)(tt);
}
do *op++ = *ii++; while (--t > 0);
}
*op++ = M4_MARKER | 1;
*op++ = 0;
*op++ = 0;
*out_len = pd(op, out);
return 0; /*LZO_E_OK*/
}
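
The literal-run header that DO_COMPRESS() writes for its final run (and that do_compress() writes before each match) can be isolated as below. This is a sketch under stated assumptions: `lzo_run_header` and its `at_start` flag are hypothetical names, and the function only writes the length header, not the literal bytes that follow it. For a mid-stream run of 1..3 bytes the original stores the count in the low two bits of the previous instruction (`op[-2] |= t`); the sketch reports zero header bytes for that case and leaves the bit-patching to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the length header for a run of t literal bytes at op.
 * at_start is nonzero when nothing has been written to the output yet.
 * Returns the number of header bytes written (0 for a short mid-stream
 * run, whose count lives in the previous instruction instead). */
size_t lzo_run_header(uint8_t *op, unsigned t, int at_start)
{
    uint8_t *start = op;

    assert(t > 0);
    if (at_start && t <= 238) {
        /* first instruction of the stream: a single byte above 17 */
        *op++ = (uint8_t)(17 + t);
    } else if (t <= 3) {
        /* count goes into the low two bits of the previous instruction */
    } else if (t <= 18) {
        *op++ = (uint8_t)(t - 3);
    } else {
        /* long run: a zero byte, then one zero byte per extra 255,
         * then the nonzero remainder */
        unsigned tt = t - 18;
        *op++ = 0;
        while (tt > 255) {
            tt -= 255;
            *op++ = 0;
        }
        *op++ = (uint8_t)tt;
    }
    return (size_t)(op - start);
}
```

For a 300-byte run this yields the bytes 0, 0, 27. A decoder reading them adds 255 for the second zero byte, then 15 + 27, and copies three more bytes than that total, which gives 300 again.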


@ -0,0 +1,422 @@
/* implementation of the LZO1X decompression algorithm
This file is part of the LZO real-time data compression library.
Copyright (C) 1996..2008 Markus Franz Xaver Johannes Oberhumer
All Rights Reserved.
Markus F.X.J. Oberhumer <markus@oberhumer.com>
http://www.oberhumer.com/opensource/lzo/
The LZO library is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
The LZO library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with the LZO library; see the file COPYING.
If not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include "libbb.h"
#include "liblzo.h"
/***********************************************************************
// decompress a block of data.
************************************************************************/
/* safe decompression with overrun testing */
int lzo1x_decompress_safe(const uint8_t* in, unsigned in_len,
uint8_t* out, unsigned* out_len /*, void* wrkmem */)
{
register uint8_t* op;
register const uint8_t* ip;
register unsigned t;
#if defined(COPY_DICT)
unsigned m_off;
const uint8_t* dict_end;
#else
register const uint8_t* m_pos = NULL; /* possibly not needed */
#endif
const uint8_t* const ip_end = in + in_len;
#if defined(HAVE_ANY_OP)
uint8_t* const op_end = out + *out_len;
#endif
#if defined(LZO1Z)
unsigned last_m_off = 0;
#endif
// LZO_UNUSED(wrkmem);
#if defined(COPY_DICT)
if (dict) {
if (dict_len > M4_MAX_OFFSET) {
dict += dict_len - M4_MAX_OFFSET;
dict_len = M4_MAX_OFFSET;
}
dict_end = dict + dict_len;
} else {
dict_len = 0;
dict_end = NULL;
}
#endif /* COPY_DICT */
*out_len = 0;
op = out;
ip = in;
if (*ip > 17) {
t = *ip++ - 17;
if (t < 4)
goto match_next;
assert(t > 0); NEED_OP(t); NEED_IP(t+1);
do *op++ = *ip++; while (--t > 0);
goto first_literal_run;
}
while (TEST_IP && TEST_OP) {
t = *ip++;
if (t >= 16)
goto match;
/* a literal run */
if (t == 0) {
NEED_IP(1);
while (*ip == 0) {
t += 255;
ip++;
NEED_IP(1);
}
TEST_IV(t);
t += 15 + *ip++;
}
/* copy literals */
assert(t > 0);
NEED_OP(t+3);
NEED_IP(t+4);
#if defined(LZO_UNALIGNED_OK_4) || defined(LZO_ALIGNED_OK_4)
# if !defined(LZO_UNALIGNED_OK_4)
if (PTR_ALIGNED2_4(op, ip))
# endif
{
COPY4(op, ip);
op += 4;
ip += 4;
if (--t > 0) {
if (t >= 4) {
do {
COPY4(op, ip);
op += 4;
ip += 4;
t -= 4;
} while (t >= 4);
if (t > 0)
do *op++ = *ip++; while (--t > 0);
} else {
do *op++ = *ip++; while (--t > 0);
}
}
}
# if !defined(LZO_UNALIGNED_OK_4)
else
# endif
#endif
#if !defined(LZO_UNALIGNED_OK_4)
{
*op++ = *ip++;
*op++ = *ip++;
*op++ = *ip++;
do *op++ = *ip++; while (--t > 0);
}
#endif
first_literal_run:
t = *ip++;
if (t >= 16)
goto match;
#if defined(COPY_DICT)
#if defined(LZO1Z)
m_off = (1 + M2_MAX_OFFSET) + (t << 6) + (*ip++ >> 2);
last_m_off = m_off;
#else
m_off = (1 + M2_MAX_OFFSET) + (t >> 2) + (*ip++ << 2);
#endif
NEED_OP(3);
t = 3; COPY_DICT(t,m_off)
#else /* !COPY_DICT */
#if defined(LZO1Z)
t = (1 + M2_MAX_OFFSET) + (t << 6) + (*ip++ >> 2);
m_pos = op - t;
last_m_off = t;
#else
m_pos = op - (1 + M2_MAX_OFFSET);
m_pos -= t >> 2;
m_pos -= *ip++ << 2;
#endif
TEST_LB(m_pos); NEED_OP(3);
*op++ = *m_pos++;
*op++ = *m_pos++;
*op++ = *m_pos;
#endif /* COPY_DICT */
goto match_done;
/* handle matches */
do {
match:
if (t >= 64) { /* an M2 match */
#if defined(COPY_DICT)
#if defined(LZO1X)
m_off = 1 + ((t >> 2) & 7) + (*ip++ << 3);
t = (t >> 5) - 1;
#elif defined(LZO1Y)
m_off = 1 + ((t >> 2) & 3) + (*ip++ << 2);
t = (t >> 4) - 3;
#elif defined(LZO1Z)
m_off = t & 0x1f;
if (m_off >= 0x1c)
m_off = last_m_off;
else {
m_off = 1 + (m_off << 6) + (*ip++ >> 2);
last_m_off = m_off;
}
t = (t >> 5) - 1;
#endif
#else /* !COPY_DICT */
#if defined(LZO1X)
m_pos = op - 1;
m_pos -= (t >> 2) & 7;
m_pos -= *ip++ << 3;
t = (t >> 5) - 1;
#elif defined(LZO1Y)
m_pos = op - 1;
m_pos -= (t >> 2) & 3;
m_pos -= *ip++ << 2;
t = (t >> 4) - 3;
#elif defined(LZO1Z)
{
unsigned off = t & 0x1f;
m_pos = op;
if (off >= 0x1c) {
assert(last_m_off > 0);
m_pos -= last_m_off;
} else {
off = 1 + (off << 6) + (*ip++ >> 2);
m_pos -= off;
last_m_off = off;
}
}
t = (t >> 5) - 1;
#endif
TEST_LB(m_pos); assert(t > 0); NEED_OP(t+3-1);
goto copy_match;
#endif /* COPY_DICT */
}
else if (t >= 32) { /* an M3 match */
t &= 31;
if (t == 0) {
NEED_IP(1);
while (*ip == 0) {
t += 255;
ip++;
NEED_IP(1);
}
TEST_IV(t);
t += 31 + *ip++;
}
#if defined(COPY_DICT)
#if defined(LZO1Z)
m_off = 1 + (ip[0] << 6) + (ip[1] >> 2);
last_m_off = m_off;
#else
m_off = 1 + (ip[0] >> 2) + (ip[1] << 6);
#endif
#else /* !COPY_DICT */
#if defined(LZO1Z)
{
unsigned off = 1 + (ip[0] << 6) + (ip[1] >> 2);
m_pos = op - off;
last_m_off = off;
}
#elif defined(LZO_UNALIGNED_OK_2) && defined(LZO_ABI_LITTLE_ENDIAN)
m_pos = op - 1;
m_pos -= (* (const lzo_ushortp) ip) >> 2;
#else
m_pos = op - 1;
m_pos -= (ip[0] >> 2) + (ip[1] << 6);
#endif
#endif /* COPY_DICT */
ip += 2;
}
else if (t >= 16) { /* an M4 match */
#if defined(COPY_DICT)
m_off = (t & 8) << 11;
#else /* !COPY_DICT */
m_pos = op;
m_pos -= (t & 8) << 11;
#endif /* COPY_DICT */
t &= 7;
if (t == 0) {
NEED_IP(1);
while (*ip == 0) {
t += 255;
ip++;
NEED_IP(1);
}
TEST_IV(t);
t += 7 + *ip++;
}
#if defined(COPY_DICT)
#if defined(LZO1Z)
m_off += (ip[0] << 6) + (ip[1] >> 2);
#else
m_off += (ip[0] >> 2) + (ip[1] << 6);
#endif
ip += 2;
if (m_off == 0)
goto eof_found;
m_off += 0x4000;
#if defined(LZO1Z)
last_m_off = m_off;
#endif
#else /* !COPY_DICT */
#if defined(LZO1Z)
m_pos -= (ip[0] << 6) + (ip[1] >> 2);
#elif defined(LZO_UNALIGNED_OK_2) && defined(LZO_ABI_LITTLE_ENDIAN)
m_pos -= (* (const lzo_ushortp) ip) >> 2;
#else
m_pos -= (ip[0] >> 2) + (ip[1] << 6);
#endif
ip += 2;
if (m_pos == op)
goto eof_found;
m_pos -= 0x4000;
#if defined(LZO1Z)
last_m_off = pd((const uint8_t*)op, m_pos);
#endif
#endif /* COPY_DICT */
}
else { /* an M1 match */
#if defined(COPY_DICT)
#if defined(LZO1Z)
m_off = 1 + (t << 6) + (*ip++ >> 2);
last_m_off = m_off;
#else
m_off = 1 + (t >> 2) + (*ip++ << 2);
#endif
NEED_OP(2);
t = 2; COPY_DICT(t,m_off)
#else /* !COPY_DICT */
#if defined(LZO1Z)
t = 1 + (t << 6) + (*ip++ >> 2);
m_pos = op - t;
last_m_off = t;
#else
m_pos = op - 1;
m_pos -= t >> 2;
m_pos -= *ip++ << 2;
#endif
TEST_LB(m_pos); NEED_OP(2);
*op++ = *m_pos++;
*op++ = *m_pos;
#endif /* COPY_DICT */
goto match_done;
}
/* copy match */
#if defined(COPY_DICT)
NEED_OP(t+3-1);
t += 3-1; COPY_DICT(t,m_off)
#else /* !COPY_DICT */
TEST_LB(m_pos); assert(t > 0); NEED_OP(t+3-1);
#if defined(LZO_UNALIGNED_OK_4) || defined(LZO_ALIGNED_OK_4)
# if !defined(LZO_UNALIGNED_OK_4)
if (t >= 2 * 4 - (3 - 1) && PTR_ALIGNED2_4(op,m_pos)) {
assert((op - m_pos) >= 4); /* both pointers are aligned */
# else
if (t >= 2 * 4 - (3 - 1) && (op - m_pos) >= 4) {
# endif
COPY4(op,m_pos);
op += 4; m_pos += 4; t -= 4 - (3 - 1);
do {
COPY4(op,m_pos);
op += 4; m_pos += 4; t -= 4;
} while (t >= 4);
if (t > 0)
do *op++ = *m_pos++; while (--t > 0);
}
else
#endif
{
copy_match:
*op++ = *m_pos++; *op++ = *m_pos++;
do *op++ = *m_pos++; while (--t > 0);
}
#endif /* COPY_DICT */
match_done:
#if defined(LZO1Z)
t = ip[-1] & 3;
#else
t = ip[-2] & 3;
#endif
if (t == 0)
break;
/* copy literals */
match_next:
assert(t > 0);
assert(t < 4);
NEED_OP(t);
NEED_IP(t+1);
#if 0
do *op++ = *ip++; while (--t > 0);
#else
*op++ = *ip++;
if (t > 1) {
*op++ = *ip++;
if (t > 2)
*op++ = *ip++;
}
#endif
t = *ip++;
} while (TEST_IP && TEST_OP);
}
//#if defined(HAVE_TEST_IP) || defined(HAVE_TEST_OP)
/* no EOF code was found */
*out_len = pd(op, out);
return LZO_E_EOF_NOT_FOUND;
//#endif
eof_found:
assert(t == 1);
*out_len = pd(op, out);
return (ip == ip_end ? LZO_E_OK :
(ip < ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN));
//#if defined(HAVE_NEED_IP)
input_overrun:
*out_len = pd(op, out);
return LZO_E_INPUT_OVERRUN;
//#endif
//#if defined(HAVE_NEED_OP)
output_overrun:
*out_len = pd(op, out);
return LZO_E_OUTPUT_OVERRUN;
//#endif
//#if defined(LZO_TEST_OVERRUN_LOOKBEHIND)
lookbehind_overrun:
*out_len = pd(op, out);
return LZO_E_LOOKBEHIND_OVERRUN;
//#endif
}
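
The LZO1X M2 branch of the decoder above is the inverse of the two-byte M2 encoding in the compressor. The sketch below pairs the two so the bit layout can be checked by a round trip. The names `m2_match`, `m2_encode` and `m2_decode` are hypothetical, and the limits 8 and 0x0800 are assumed values for M2_MAX_LEN and M2_MAX_OFFSET, which are defined outside this hunk.

```c
#include <assert.h>
#include <stdint.h>

struct m2_match {
    unsigned len; /* match length in bytes */
    unsigned off; /* distance back from the output position */
};

/* Encode a short, near match into two bytes (LZO1X layout).
 * Assumed limits: 3 <= m_len <= 8 and 1 <= m_off <= 0x0800. */
void m2_encode(uint8_t out[2], unsigned m_len, unsigned m_off)
{
    assert(m_len >= 3 && m_len <= 8);
    assert(m_off >= 1 && m_off <= 0x0800);
    m_off -= 1;
    /* bits 7..5: length - 1, bits 4..2: low offset bits,
     * bits 1..0: left free for a trailing literal count */
    out[0] = (uint8_t)(((m_len - 1) << 5) | ((m_off & 7) << 2));
    out[1] = (uint8_t)(m_off >> 3);
}

/* Decode the same two bytes back into a length and an offset. */
struct m2_match m2_decode(const uint8_t in[2])
{
    struct m2_match m;
    unsigned t = in[0];

    assert(t >= 64); /* smaller opcodes are M3, M4, M1 or literal runs */
    m.off = 1 + ((t >> 2) & 7) + ((unsigned)in[1] << 3);
    /* the decoder computes (t >> 5) - 1 and then copies two more bytes */
    m.len = (t >> 5) + 1;
    return m;
}
```

Because the length field is never below 2, every M2 opcode is at least 64, which is how the decoder tells it apart from the other instruction types by a single comparison.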


@ -0,0 +1,386 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC init_transformer_state(transformer_state_t *xstate)
{
memset(xstate, 0, sizeof(*xstate));
}
int FAST_FUNC check_signature16(transformer_state_t *xstate, unsigned magic16)
{
if (!xstate->signature_skipped) {
uint16_t magic2;
if (full_read(xstate->src_fd, &magic2, 2) != 2 || magic2 != magic16) {
bb_simple_error_msg("invalid magic");
return -1;
}
xstate->signature_skipped = 2;
}
return 0;
}
ssize_t FAST_FUNC transformer_write(transformer_state_t *xstate, const void *buf, size_t bufsize)
{
ssize_t nwrote;
if (xstate->mem_output_size_max != 0) {
size_t pos = xstate->mem_output_size;
size_t size;
size = (xstate->mem_output_size += bufsize);
if (size > xstate->mem_output_size_max) {
free(xstate->mem_output_buf);
xstate->mem_output_buf = NULL;
bb_perror_msg("buffer %u too small", (unsigned)xstate->mem_output_size_max);
nwrote = -1;
goto ret;
}
xstate->mem_output_buf = xrealloc(xstate->mem_output_buf, size + 1);
memcpy(xstate->mem_output_buf + pos, buf, bufsize);
xstate->mem_output_buf[size] = '\0';
nwrote = bufsize;
} else {
nwrote = full_write(xstate->dst_fd, buf, bufsize);
if (nwrote != (ssize_t)bufsize) {
bb_simple_perror_msg("write");
nwrote = -1;
goto ret;
}
}
ret:
return nwrote;
}
ssize_t FAST_FUNC xtransformer_write(transformer_state_t *xstate, const void *buf, size_t bufsize)
{
ssize_t nwrote = transformer_write(xstate, buf, bufsize);
if (nwrote != (ssize_t)bufsize) {
xfunc_die();
}
return nwrote;
}
void check_errors_in_children(int signo)
{
int status;
if (!signo) {
/* block waiting for any child */
if (wait(&status) < 0)
//FIXME: check EINTR?
return; /* probably there are no children */
goto check_status;
}
/* Wait for any child without blocking */
for (;;) {
if (wait_any_nohang(&status) < 0)
//FIXME: check EINTR?
/* wait failed?! I'm confused... */
return;
check_status:
/*if (WIFEXITED(status) && WEXITSTATUS(status) == 0)*/
/* On Linux, the above can be checked simply as: */
if (status == 0)
/* this child exited with 0 */
continue;
/* Cannot happen:
if (!WIFSIGNALED(status) && !WIFEXITED(status)) ???;
*/
bb_got_signal = 1;
}
}
/* transformer(), more than meets the eye */
#if BB_MMU
void FAST_FUNC fork_transformer(int fd,
int signature_skipped,
IF_DESKTOP(long long) int FAST_FUNC (*transformer)(transformer_state_t *xstate)
)
#else
void FAST_FUNC fork_transformer(int fd, const char *transform_prog)
#endif
{
struct fd_pair fd_pipe;
int pid;
xpiped_pair(fd_pipe);
pid = BB_MMU ? xfork() : xvfork();
if (pid == 0) {
/* Child */
close(fd_pipe.rd); /* we don't want to read from the parent */
// FIXME: error check?
#if BB_MMU
{
IF_DESKTOP(long long) int r;
transformer_state_t xstate;
init_transformer_state(&xstate);
xstate.signature_skipped = signature_skipped;
xstate.src_fd = fd;
xstate.dst_fd = fd_pipe.wr;
r = transformer(&xstate);
if (ENABLE_FEATURE_CLEAN_UP) {
close(fd_pipe.wr); /* send EOF */
close(fd);
}
/* must be _exit! bug was actually seen here */
_exit(/*error if:*/ r < 0);
}
#else
{
char *argv[4];
xmove_fd(fd, 0);
xmove_fd(fd_pipe.wr, 1);
argv[0] = (char*)transform_prog;
argv[1] = (char*)"-cf";
argv[2] = (char*)"-";
argv[3] = NULL;
BB_EXECVP(transform_prog, argv);
bb_perror_msg_and_die("can't execute '%s'", transform_prog);
}
#endif
/* notreached */
}
/* parent process */
close(fd_pipe.wr); /* don't want to write to the child */
xmove_fd(fd_pipe.rd, fd);
}
#if SEAMLESS_COMPRESSION
/* Used by e.g. rpm which gives us a fd without filename,
* thus we can't guess the format from filename's extension.
*/
static transformer_state_t *setup_transformer_on_fd(int fd, int fail_if_not_compressed)
{
transformer_state_t *xstate;
xstate = xzalloc(sizeof(*xstate));
xstate->src_fd = fd;
/* .gz and .bz2 both have 2-byte signature, and their
* unpack_XXX_stream wants this header skipped. */
xstate->signature_skipped = 2;
xread(fd, xstate->magic.b16, 2);
if (ENABLE_FEATURE_SEAMLESS_GZ
&& xstate->magic.b16[0] == GZIP_MAGIC
) {
xstate->xformer = unpack_gz_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "gunzip";)
goto found_magic;
}
if (ENABLE_FEATURE_SEAMLESS_Z
&& xstate->magic.b16[0] == COMPRESS_MAGIC
) {
xstate->xformer = unpack_Z_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "uncompress";)
goto found_magic;
}
if (ENABLE_FEATURE_SEAMLESS_BZ2
&& xstate->magic.b16[0] == BZIP2_MAGIC
) {
xstate->xformer = unpack_bz2_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "bunzip2";)
goto found_magic;
}
if (ENABLE_FEATURE_SEAMLESS_XZ
&& xstate->magic.b16[0] == XZ_MAGIC1
) {
uint32_t v32;
xstate->signature_skipped = 6;
xread(fd, &xstate->magic.b16[1], 4);
move_from_unaligned32(v32, &xstate->magic.b16[1]);
if (v32 == XZ_MAGIC2) {
xstate->xformer = unpack_xz_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "unxz";)
goto found_magic;
}
}
/* No known magic seen */
if (fail_if_not_compressed)
bb_simple_error_msg_and_die("no gzip"
IF_FEATURE_SEAMLESS_BZ2("/bzip2")
IF_FEATURE_SEAMLESS_XZ("/xz")
" magic");
/* Some callers expect this function to "consume" fd
* even if data is not compressed. In this case,
* we return a state with trivial transformer.
*/
// USE_FOR_MMU(xstate->xformer = copy_stream;)
// USE_FOR_NOMMU(xstate->xformer_prog = "cat";)
found_magic:
return xstate;
}
static void fork_transformer_and_free(transformer_state_t *xstate)
{
# if BB_MMU
fork_transformer_with_no_sig(xstate->src_fd, xstate->xformer);
# else
/* NOMMU version of fork_transformer execs
* an external unzipper that wants
* file position at the start of the file.
*/
xlseek(xstate->src_fd, - xstate->signature_skipped, SEEK_CUR);
xstate->signature_skipped = 0;
fork_transformer_with_sig(xstate->src_fd, xstate->xformer, xstate->xformer_prog);
# endif
free(xstate);
}
/* Used by e.g. rpm which gives us a fd without filename,
* thus we can't guess the format from filename's extension.
*/
int FAST_FUNC setup_unzip_on_fd(int fd, int fail_if_not_compressed)
{
transformer_state_t *xstate = setup_transformer_on_fd(fd, fail_if_not_compressed);
if (!xstate->xformer) {
free(xstate);
return 1;
}
fork_transformer_and_free(xstate);
return 0;
}
#if ENABLE_FEATURE_SEAMLESS_LZMA
/* ...and custom version for LZMA */
void FAST_FUNC setup_lzma_on_fd(int fd)
{
transformer_state_t *xstate = xzalloc(sizeof(*xstate));
xstate->src_fd = fd;
xstate->xformer = unpack_lzma_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "unlzma";)
fork_transformer_and_free(xstate);
}
#endif
static transformer_state_t *open_transformer(const char *fname, int fail_if_not_compressed)
{
transformer_state_t *xstate;
int fd;
fd = open(fname, O_RDONLY);
if (fd < 0)
return NULL;
if (ENABLE_FEATURE_SEAMLESS_LZMA) {
/* .lzma has no header/signature, can only detect it by extension */
if (is_suffixed_with(fname, ".lzma")) {
xstate = xzalloc(sizeof(*xstate));
xstate->src_fd = fd;
xstate->xformer = unpack_lzma_stream;
USE_FOR_NOMMU(xstate->xformer_prog = "unlzma";)
return xstate;
}
}
xstate = setup_transformer_on_fd(fd, fail_if_not_compressed);
return xstate;
}
int FAST_FUNC open_zipped(const char *fname, int fail_if_not_compressed)
{
int fd;
transformer_state_t *xstate;
xstate = open_transformer(fname, fail_if_not_compressed);
if (!xstate)
return -1;
fd = xstate->src_fd;
# if BB_MMU
if (xstate->xformer) {
fork_transformer_with_no_sig(fd, xstate->xformer);
} else {
/* the file is not compressed */
xlseek(fd, - xstate->signature_skipped, SEEK_CUR);
xstate->signature_skipped = 0;
}
# else
/* NOMMU can't avoid the seek :( */
xlseek(fd, - xstate->signature_skipped, SEEK_CUR);
xstate->signature_skipped = 0;
if (xstate->xformer) {
fork_transformer_with_sig(fd, xstate->xformer, xstate->xformer_prog);
} /* else: the file is not compressed */
# endif
free(xstate);
return fd;
}
void* FAST_FUNC xmalloc_open_zipped_read_close(const char *fname, size_t *maxsz_p)
{
# if 1
transformer_state_t *xstate;
char *image;
xstate = open_transformer(fname, /*fail_if_not_compressed:*/ 0);
if (!xstate) /* file open error */
return NULL;
image = NULL;
if (xstate->xformer) {
/* In-memory decompression */
xstate->mem_output_size_max = maxsz_p ? *maxsz_p : (size_t)(INT_MAX - 4095);
xstate->xformer(xstate);
if (xstate->mem_output_buf) {
image = xstate->mem_output_buf;
if (maxsz_p)
*maxsz_p = xstate->mem_output_size;
}
} else {
/* File is not compressed.
* We already read first few bytes, account for that.
* Example where it happens:
* "modinfo MODULE.ko" (not compressed)
* open("MODULE.ko", O_RDONLY|O_LARGEFILE) = 4
* read(4, "\177E", 2) = 2
* fstat64(4, ...)
* mmap(...)
* read(4, "LF\2\1\1\0\0\0\0"...
* ...and we avoided seeking on the fd! :)
*/
image = xmalloc_read_with_initial_buf(
xstate->src_fd,
maxsz_p,
xmemdup(&xstate->magic, xstate->signature_skipped),
xstate->signature_skipped
);
xstate->signature_skipped = 0;
}
if (!image)
bb_perror_msg("read error from '%s'", fname);
close(xstate->src_fd);
free(xstate);
return image;
# else
/* This version forks a subprocess - much more expensive */
int fd;
char *image;
fd = open_zipped(fname, /*fail_if_not_compressed:*/ 0);
if (fd < 0)
return NULL;
image = xmalloc_read(fd, maxsz_p);
if (!image)
bb_perror_msg("read error from '%s'", fname);
close(fd);
return image;
# endif
}
#endif /* SEAMLESS_COMPRESSION */

@ -0,0 +1,18 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
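/* Skip forward by amount bytes. If the fd is not seekable (lseek fails
* with ESPIPE, e.g. on a pipe), fall back to reading and discarding.
*/
void FAST_FUNC seek_by_jump(int fd, off_t amount)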
{
if (amount
&& lseek(fd, amount, SEEK_CUR) == (off_t) -1
) {
if (errno == ESPIPE)
seek_by_read(fd, amount);
else
bb_simple_perror_msg_and_die("seek failure");
}
}

@ -0,0 +1,15 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
/* If we are reading through a pipe, or from stdin then we can't lseek,
* we must read and discard the data to skip over it.
*/
void FAST_FUNC seek_by_read(int fd, off_t amount)
{
if (amount)
bb_copyfd_exact_size(fd, -1, amount);
}

@ -0,0 +1,21 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
#include "ar_.h"
void FAST_FUNC unpack_ar_archive(archive_handle_t *ar_archive)
{
char magic[7];
xread(ar_archive->src_fd, magic, AR_MAGIC_LEN);
if (!is_prefixed_with(magic, AR_MAGIC)) {
bb_simple_error_msg_and_die("invalid ar magic");
}
ar_archive->offset += AR_MAGIC_LEN;
while (get_header_ar(ar_archive) == EXIT_SUCCESS)
continue;
}

@ -0,0 +1,35 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
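/* Return a pointer into str past any leading '/' characters, any leading
* "../" and the last "/../" component. Warn once if anything was skipped.
*/
const char* FAST_FUNC strip_unsafe_prefix(const char *str)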
{
const char *cp = str;
while (1) {
char *cp2;
if (*cp == '/') {
cp++;
continue;
}
if (is_prefixed_with(cp, "/../"+1)) {
cp += 3;
continue;
}
cp2 = strstr(cp, "/../");
if (!cp2)
break;
cp = cp2 + 4;
}
if (cp != str) {
static smallint warned = 0;
if (!warned) {
warned = 1;
bb_error_msg("removing leading '%.*s' from member names",
(int)(cp - str), str);
}
}
return cp;
}

@ -0,0 +1,42 @@
/* vi: set sw=4 ts=4: */
/*
* Licensed under GPLv2 or later, see file LICENSE in this source tree.
*/
#include "libbb.h"
#include "bb_archive.h"
void FAST_FUNC create_or_remember_link(llist_t **link_placeholders,
const char *target,
const char *linkname,
int hard_link)
{
if (hard_link || target[0] == '/' || strstr(target, "..")) {
llist_add_to_end(link_placeholders,
xasprintf("%c%s%c%s", hard_link, linkname, '\0', target)
);
return;
}
if (symlink(target, linkname) != 0) {
/* shared message */
bb_perror_msg_and_die("can't create %slink '%s' to '%s'",
"sym", linkname, target
);
}
}
void FAST_FUNC create_links_from_list(llist_t *list)
{
while (list) {
char *target;
target = list->data + 1 + strlen(list->data + 1) + 1;
if ((*list->data ? link : symlink) (target, list->data + 1)) {
/* shared message */
bb_perror_msg_and_die("can't create %slink '%s' to '%s'",
*list->data ? "hard" : "sym",
list->data + 1, target
);
}
list = list->link;
}
}

@ -0,0 +1,135 @@
XZ Embedded
===========
XZ Embedded is a relatively small, limited implementation of the .xz
file format. Currently only decoding is implemented.
XZ Embedded was written for use in the Linux kernel, but the code can
be easily used in other environments too, including regular userspace
applications. See userspace/xzminidec.c for an example program.
This README contains information that is useful only when the copy
of XZ Embedded isn't part of the Linux kernel tree. You should also
read linux/Documentation/xz.txt even if you aren't using XZ Embedded
as part of Linux; information in that file is not repeated in this
README.
Compiling the Linux kernel module
The xz_dec module depends on the crc32 module, so make sure that you have
it enabled (CONFIG_CRC32).
Building the xz_dec and xz_dec_test modules without support for BCJ
filters:
cd linux/lib/xz
make -C /path/to/kernel/source \
KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m
Building the xz_dec and xz_dec_test modules with support for BCJ
filters:
cd linux/lib/xz
make -C /path/to/kernel/source \
KCPPFLAGS=-I"$(pwd)/../../include" M="$(pwd)" \
CONFIG_XZ_DEC=m CONFIG_XZ_DEC_TEST=m CONFIG_XZ_DEC_BCJ=y \
CONFIG_XZ_DEC_X86=y CONFIG_XZ_DEC_POWERPC=y \
CONFIG_XZ_DEC_IA64=y CONFIG_XZ_DEC_ARM=y \
CONFIG_XZ_DEC_ARMTHUMB=y CONFIG_XZ_DEC_SPARC=y
If you want only one or a few of the BCJ filters, omit the appropriate
variables. CONFIG_XZ_DEC_BCJ=y is always required to build the support
code shared between all BCJ filters.
Most people don't need the xz_dec_test module. You can skip building
it by omitting CONFIG_XZ_DEC_TEST=m from the make command line.
Compiler requirements
XZ Embedded should compile as either GNU-C89 (used in the Linux
kernel) or with any C99 compiler. Getting the code to compile with
a non-GNU C89 compiler or a C++ compiler should be quite easy as
long as there is a data type for unsigned 64-bit integers (or the
code is modified not to support large files, which needs some more
care than just using a 32-bit integer instead of a 64-bit one).
If you use GCC, try to use a recent version. For example, on x86-32,
xz_dec_lzma2.c compiled with GCC 3.3.6 is 15-25 % slower than when
compiled with GCC 4.3.3.
Embedding into userspace applications
To embed the XZ decoder, copy the following files into a single
directory in your source code tree:
linux/include/linux/xz.h
linux/lib/xz/xz_crc32.c
linux/lib/xz/xz_dec_lzma2.c
linux/lib/xz/xz_dec_stream.c
linux/lib/xz/xz_lzma2.h
linux/lib/xz/xz_private.h
linux/lib/xz/xz_stream.h
userspace/xz_config.h
Alternatively, xz.h may be placed into a different directory but then
that directory must be in the compiler include path when compiling
the .c files.
Your code should use only the functions declared in xz.h. The rest of
the .h files are meant only for internal use in XZ Embedded.
You may want to modify xz_config.h to be more suitable for your build
environment. Probably you should at least skim through it even if the
default file works as is.
BCJ filter support
If you want support for one or more BCJ filters, you also need to copy
linux/lib/xz/xz_dec_bcj.c into your application, and use appropriate
#defines in xz_config.h or in compiler flags. You don't need these
#defines in the code that just uses XZ Embedded via xz.h, but having
them always #defined doesn't hurt either.
#define            Instruction set     BCJ filter endianness
XZ_DEC_X86         x86-32 or x86-64    Little endian only
XZ_DEC_POWERPC     PowerPC             Big endian only
XZ_DEC_IA64        Itanium (IA-64)     Big or little endian
XZ_DEC_ARM         ARM                 Little endian only
XZ_DEC_ARMTHUMB    ARM-Thumb           Little endian only
XZ_DEC_SPARC       SPARC               Big or little endian
While some architectures are (partially) bi-endian, the endianness
setting doesn't change the endianness of the instructions on all
architectures. That's why Itanium and SPARC filters work for both big
and little endian executables (Itanium has little endian instructions
and SPARC has big endian instructions).
There currently is no filter for little endian PowerPC or big endian
ARM or ARM-Thumb. Implementing filters for them can be considered if
there is a need for such filters in real-world applications.
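
As a rough illustration of why a filter's endianness follows the instruction
encoding rather than the host CPU, the sketch below assembles 32-bit words
byte by byte, in the style of the get_unaligned_be32() and
get_unaligned_le32() helpers in xz_config.h. The names read_be32, read_le32
and looks_like_powerpc_bl are made up for this example and are not part of
XZ Embedded; the mask and opcode values are the ones tested in bcj_powerpc().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: big endian read, independent of host byte order. */
uint32_t read_be32(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24)
		| ((uint32_t)buf[1] << 16)
		| ((uint32_t)buf[2] << 8)
		| (uint32_t)buf[3];
}

/* Hypothetical helper: little endian read of the same four bytes. */
uint32_t read_le32(const uint8_t *buf)
{
	return (uint32_t)buf[0]
		| ((uint32_t)buf[1] << 8)
		| ((uint32_t)buf[2] << 16)
		| ((uint32_t)buf[3] << 24);
}

/* Same test that bcj_powerpc() applies before converting an address. */
int looks_like_powerpc_bl(const uint8_t *buf)
{
	return (read_be32(buf) & 0xFC000003) == 0x48000001;
}

/* Small self-check; the byte sequence is chosen for illustration only. */
void self_check(void)
{
	const uint8_t bl[4] = { 0x48, 0x00, 0x00, 0x01 };

	assert(looks_like_powerpc_bl(bl));
	assert(read_le32(bl) != read_be32(bl));
}
```

Because the bytes are combined explicitly, the result is the same on big and
little endian hosts, which is the property the filters rely on.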
Notes about shared libraries
If you are including XZ Embedded into a shared library, you very
probably should rename the xz_* functions to prevent symbol
conflicts in case your library is linked against some other library
or application that also has XZ Embedded in it (which may even be
a different version of XZ Embedded). TODO: Provide an easy way
to do this.
Please don't create a shared library of XZ Embedded itself unless
it is fine to rebuild everything depending on that shared library
every time you upgrade to a newer version of XZ Embedded. There are
no API or ABI stability guarantees between different versions of
XZ Embedded.
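
One possible way to do the renaming, sketched here as an assumption rather
than a supported mechanism, is a small set of #defines seen by the compiler
before xz.h and the XZ Embedded .c files. The mylib_ prefix is hypothetical,
the list covers only the functions declared in xz.h, and any other xz_*
symbols with external linkage would need the same treatment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical prefix layer: map the public xz_* names to private ones. */
#define xz_dec_init    mylib_xz_dec_init
#define xz_dec_run     mylib_xz_dec_run
#define xz_dec_reset   mylib_xz_dec_reset
#define xz_dec_end     mylib_xz_dec_end
#define xz_crc32_init  mylib_xz_crc32_init
#define xz_crc32       mylib_xz_crc32

/* Two-step stringification so the argument is macro-expanded first. */
#define EXAMPLE_STR2(x) #x
#define EXAMPLE_STR(x)  EXAMPLE_STR2(x)

/* Returns the symbol name the compiler will actually see. */
const char *renamed_symbol(void)
{
	return EXAMPLE_STR(xz_dec_run);
}

/* Self-check: the macro layer rewrote the name as intended. */
void self_check(void)
{
	assert(strcmp(renamed_symbol(), "mylib_xz_dec_run") == 0);
}
```

Code that calls xz_dec_run() after these #defines ends up referencing
mylib_xz_dec_run, so two copies of XZ Embedded built with different prefixes
would not share symbol names.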
Specifying the calling convention
The XZ_FUNC macro was included to support declaring functions with __init
in Linux. Outside Linux, it can be used to specify the calling
convention on systems that support multiple calling conventions.
For example, on Windows, you may make all functions use the stdcall
calling convention by defining XZ_FUNC=__stdcall when building and
using the functions from XZ Embedded.
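
A minimal sketch of the pattern, assuming a made-up function
example_entry_point that stands in for a real XZ Embedded entry point:
XZ_FUNC defaults to nothing, as in xz.h, and sits between the return type
and the function name so that a build-time definition such as
-DXZ_FUNC=__stdcall applies to every declaration.

```c
#include <assert.h>

/* Default to an empty definition, as xz.h does when XZ_FUNC is unset. */
#ifndef XZ_FUNC
#	define XZ_FUNC
#endif

/* Hypothetical function; not part of XZ Embedded. */
int XZ_FUNC example_entry_point(int x)
{
	return x + 1;
}

/* Self-check: with the default empty XZ_FUNC this is a plain C call. */
void self_check(void)
{
	assert(example_entry_point(1) == 2);
}
```

The same definition of XZ_FUNC has to be visible both when building
XZ Embedded and when compiling the code that calls it, otherwise caller and
callee would disagree on the calling convention.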

@ -0,0 +1,280 @@
/*
* XZ decompressor
*
* Authors: Lasse Collin <lasse.collin@tukaani.org>
* Igor Pavlov <http://7-zip.org/>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#ifndef XZ_H
#define XZ_H
#ifdef __KERNEL__
# include <linux/stddef.h>
# include <linux/types.h>
#else
# include <stddef.h>
# include <stdint.h>
#endif
#ifdef __cplusplus
extern "C" {
#endif
/* In Linux, this is used to make extern functions static when needed. */
#ifndef XZ_EXTERN
# define XZ_EXTERN extern
#endif
/* In Linux, this is used to mark the functions with __init when needed. */
#ifndef XZ_FUNC
# define XZ_FUNC
#endif
/**
* enum xz_mode - Operation mode
*
* @XZ_SINGLE: Single-call mode. This uses less RAM than
* multi-call modes, because the LZMA2
* dictionary doesn't need to be allocated as
* part of the decoder state. All required data
* structures are allocated at initialization,
* so xz_dec_run() cannot return XZ_MEM_ERROR.
* @XZ_PREALLOC: Multi-call mode with preallocated LZMA2
* dictionary buffer. All data structures are
* allocated at initialization, so xz_dec_run()
* cannot return XZ_MEM_ERROR.
* @XZ_DYNALLOC: Multi-call mode. The LZMA2 dictionary is
* allocated once the required size has been
* parsed from the stream headers. If the
* allocation fails, xz_dec_run() will return
* XZ_MEM_ERROR.
*
* It is possible to enable support only for a subset of the above
* modes at compile time by defining XZ_DEC_SINGLE, XZ_DEC_PREALLOC,
* or XZ_DEC_DYNALLOC. The xz_dec kernel module is always compiled
* with support for all operation modes, but the preboot code may
* be built with fewer features to minimize code size.
*/
enum xz_mode {
XZ_SINGLE,
XZ_PREALLOC,
XZ_DYNALLOC
};
/**
* enum xz_ret - Return codes
* @XZ_OK: Everything is OK so far. More input or more
* output space is required to continue. This
* return code is possible only in multi-call mode
* (XZ_PREALLOC or XZ_DYNALLOC).
* @XZ_STREAM_END: Operation finished successfully.
* @XZ_UNSUPPORTED_CHECK: Integrity check type is not supported. Decoding
* is still possible in multi-call mode by simply
* calling xz_dec_run() again.
* Note that this return value is used only if
* XZ_DEC_ANY_CHECK was defined at build time,
* which is not used in the kernel. Unsupported
* check types return XZ_OPTIONS_ERROR if
* XZ_DEC_ANY_CHECK was not defined at build time.
* @XZ_MEM_ERROR: Allocating memory failed. This return code is
* possible only if the decoder was initialized
* with XZ_DYNALLOC. The amount of memory that was
* tried to be allocated was no more than the
* dict_max argument given to xz_dec_init().
* @XZ_MEMLIMIT_ERROR: A bigger LZMA2 dictionary would be needed than
* allowed by the dict_max argument given to
* xz_dec_init(). This return value is possible
* only in multi-call mode (XZ_PREALLOC or
* XZ_DYNALLOC); the single-call mode (XZ_SINGLE)
* ignores the dict_max argument.
* @XZ_FORMAT_ERROR: File format was not recognized (wrong magic
* bytes).
* @XZ_OPTIONS_ERROR: This implementation doesn't support the requested
* compression options. In the decoder this means
* that the header CRC32 matches, but the header
* itself specifies something that we don't support.
* @XZ_DATA_ERROR: Compressed data is corrupt.
* @XZ_BUF_ERROR: Cannot make any progress. Details are slightly
* different between multi-call and single-call
* mode; more information below.
*
* In multi-call mode, XZ_BUF_ERROR is returned when two consecutive calls
* to XZ code cannot consume any input and cannot produce any new output.
* This happens when there is no new input available, or the output buffer
* is full while at least one output byte is still pending. Assuming your
* code is not buggy, you can get this error only when decoding a compressed
* stream that is truncated or otherwise corrupt.
*
* In single-call mode, XZ_BUF_ERROR is returned only when the output buffer
* is too small or the compressed input is corrupt in a way that makes the
* decoder produce more output than the caller expected. When it is
* (relatively) clear that the compressed input is truncated, XZ_DATA_ERROR
* is used instead of XZ_BUF_ERROR.
*/
enum xz_ret {
XZ_OK,
XZ_STREAM_END,
XZ_UNSUPPORTED_CHECK,
XZ_MEM_ERROR,
XZ_MEMLIMIT_ERROR,
XZ_FORMAT_ERROR,
XZ_OPTIONS_ERROR,
XZ_DATA_ERROR,
XZ_BUF_ERROR
};
/**
* struct xz_buf - Passing input and output buffers to XZ code
* @in: Beginning of the input buffer. This may be NULL if and only
* if in_pos is equal to in_size.
* @in_pos: Current position in the input buffer. This must not exceed
* in_size.
* @in_size: Size of the input buffer
* @out: Beginning of the output buffer. This may be NULL if and only
* if out_pos is equal to out_size.
* @out_pos: Current position in the output buffer. This must not exceed
* out_size.
* @out_size: Size of the output buffer
*
* Only the contents of the output buffer from out[out_pos] onward, and
* the variables in_pos and out_pos are modified by the XZ code.
*/
struct xz_buf {
const uint8_t *in;
size_t in_pos;
size_t in_size;
uint8_t *out;
size_t out_pos;
size_t out_size;
};
/**
* struct xz_dec - Opaque type to hold the XZ decoder state
*/
struct xz_dec;
/**
* xz_dec_init() - Allocate and initialize an XZ decoder state
* @mode: Operation mode
* @dict_max: Maximum size of the LZMA2 dictionary (history buffer) for
* multi-call decoding. This is ignored in single-call mode
* (mode == XZ_SINGLE). LZMA2 dictionary is always 2^n bytes
* or 2^n + 2^(n-1) bytes (the latter sizes are less common
* in practice), so other values for dict_max don't make sense.
* In the kernel, dictionary sizes of 64 KiB, 128 KiB, 256 KiB,
* 512 KiB, and 1 MiB are probably the only reasonable values,
* except for kernel and initramfs images where a bigger
* dictionary can be fine and useful.
*
* Single-call mode (XZ_SINGLE): xz_dec_run() decodes the whole stream at
* once. The caller must provide enough output space or the decoding will
* fail. The output space is used as the dictionary buffer, which is why
* there is no need to allocate the dictionary as part of the decoder's
* internal state.
*
* Because the output buffer is used as the workspace, streams encoded using
* a big dictionary are not a problem in single-call mode. It is enough that
* the output buffer is big enough to hold the actual uncompressed data; it
* can be smaller than the dictionary size stored in the stream headers.
*
* Multi-call mode with preallocated dictionary (XZ_PREALLOC): dict_max bytes
* of memory is preallocated for the LZMA2 dictionary. This way there is no
* risk that xz_dec_run() could run out of memory, since xz_dec_run() will
* never allocate any memory. Instead, if the preallocated dictionary is too
* small for decoding the given input stream, xz_dec_run() will return
* XZ_MEMLIMIT_ERROR. Thus, it is important to know what kind of data will be
* decoded to avoid allocating an excessive amount of memory for the dictionary.
*
* Multi-call mode with dynamically allocated dictionary (XZ_DYNALLOC):
* dict_max specifies the maximum allowed dictionary size that xz_dec_run()
* may allocate once it has parsed the dictionary size from the stream
* headers. This way excessive allocations can be avoided while still
* limiting the maximum memory usage to a sane value to prevent running the
* system out of memory when decompressing streams from untrusted sources.
*
* On success, xz_dec_init() returns a pointer to struct xz_dec, which is
* ready to be used with xz_dec_run(). If memory allocation fails,
* xz_dec_init() returns NULL.
*/
XZ_EXTERN struct xz_dec * XZ_FUNC xz_dec_init(
enum xz_mode mode, uint32_t dict_max);
/**
* xz_dec_run() - Run the XZ decoder
* @s: Decoder state allocated using xz_dec_init()
* @b: Input and output buffers
*
* The possible return values depend on build options and operation mode.
* See enum xz_ret for details.
*
* Note that if an error occurs in single-call mode (return value is not
* XZ_STREAM_END), b->in_pos and b->out_pos are not modified and the
* contents of the output buffer from b->out[b->out_pos] onward are
* undefined. This is true even after XZ_BUF_ERROR, because with some filter
* chains, there may be a second pass over the output buffer, and this pass
* cannot be properly done if the output buffer is truncated. Thus, you
* cannot give the single-call decoder a too small buffer and then expect to
* get that amount of valid data from the beginning of the stream. You must use
* the multi-call decoder if you don't want to uncompress the whole stream.
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_run(struct xz_dec *s, struct xz_buf *b);
/**
* xz_dec_reset() - Reset an already allocated decoder state
* @s: Decoder state allocated using xz_dec_init()
*
* This function can be used to reset the multi-call decoder state without
* freeing and reallocating memory with xz_dec_end() and xz_dec_init().
*
* In single-call mode, xz_dec_reset() is always called in the beginning of
* xz_dec_run(). Thus, explicit call to xz_dec_reset() is useful only in
* multi-call mode.
*/
XZ_EXTERN void XZ_FUNC xz_dec_reset(struct xz_dec *s);
/**
* xz_dec_end() - Free the memory allocated for the decoder state
* @s: Decoder state allocated using xz_dec_init(). If s is NULL,
* this function does nothing.
*/
XZ_EXTERN void XZ_FUNC xz_dec_end(struct xz_dec *s);
/*
* Standalone build (userspace build or in-kernel build for boot time use)
* needs a CRC32 implementation. For normal in-kernel use, kernel's own
* CRC32 module is used instead, and users of this module don't need to
* care about the functions below.
*/
#ifndef XZ_INTERNAL_CRC32
# ifdef __KERNEL__
# define XZ_INTERNAL_CRC32 0
# else
# define XZ_INTERNAL_CRC32 1
# endif
#endif
#if XZ_INTERNAL_CRC32
/*
* This must be called before any other xz_* function to initialize
* the CRC32 lookup table.
*/
XZ_EXTERN void XZ_FUNC xz_crc32_init(void);
/*
* Update CRC32 value using the polynomial from IEEE-802.3. To start a new
* calculation, the third argument must be zero. To continue the calculation,
* the previously returned value is passed as the third argument.
*/
XZ_EXTERN uint32_t XZ_FUNC xz_crc32(
const uint8_t *buf, size_t size, uint32_t crc);
#endif
#ifdef __cplusplus
}
#endif
#endif

@ -0,0 +1,123 @@
/*
* Private includes and definitions for userspace use of XZ Embedded
*
* Author: Lasse Collin <lasse.collin@tukaani.org>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#ifndef XZ_CONFIG_H
#define XZ_CONFIG_H
/* Uncomment as needed to enable BCJ filter decoders. */
/* #define XZ_DEC_X86 */
/* #define XZ_DEC_POWERPC */
/* #define XZ_DEC_IA64 */
/* #define XZ_DEC_ARM */
/* #define XZ_DEC_ARMTHUMB */
/* #define XZ_DEC_SPARC */
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include "xz.h"
#define kmalloc(size, flags) malloc(size)
#define kfree(ptr) free(ptr)
#define vmalloc(size) malloc(size)
#define vfree(ptr) free(ptr)
#define memeq(a, b, size) (memcmp(a, b, size) == 0)
#define memzero(buf, size) memset(buf, 0, size)
#undef min
#undef min_t
#define min(x, y) ((x) < (y) ? (x) : (y))
#define min_t(type, x, y) min(x, y)
/*
* Some functions have been marked with __always_inline to keep the
* performance reasonable even when the compiler is optimizing for
* small code size. You may be able to save a few bytes by #defining
* __always_inline to plain inline, but don't complain if the code
* becomes slow.
*
* NOTE: System headers on GNU/Linux may #define this macro already,
* so if you want to change it, you need to #undef it first.
*/
#ifndef __always_inline
# ifdef __GNUC__
# define __always_inline \
inline __attribute__((__always_inline__))
# else
# define __always_inline inline
# endif
#endif
/*
* Some functions are marked to never be inlined to reduce stack usage.
* If you don't care about stack usage, you may want to modify this so
* that noinline_for_stack is #defined to be empty even when using GCC.
* Doing so may save a few bytes in binary size.
*/
#ifndef noinline_for_stack
# ifdef __GNUC__
# define noinline_for_stack __attribute__((__noinline__))
# else
# define noinline_for_stack
# endif
#endif
/* Inline functions to access unaligned unsigned 32-bit integers */
#ifndef get_unaligned_le32
static inline uint32_t XZ_FUNC get_unaligned_le32(const uint8_t *buf)
{
return (uint32_t)buf[0]
| ((uint32_t)buf[1] << 8)
| ((uint32_t)buf[2] << 16)
| ((uint32_t)buf[3] << 24);
}
#endif
#ifndef get_unaligned_be32
static inline uint32_t XZ_FUNC get_unaligned_be32(const uint8_t *buf)
{
return ((uint32_t)buf[0] << 24)
| ((uint32_t)buf[1] << 16)
| ((uint32_t)buf[2] << 8)
| (uint32_t)buf[3];
}
#endif
#ifndef put_unaligned_le32
static inline void XZ_FUNC put_unaligned_le32(uint32_t val, uint8_t *buf)
{
buf[0] = (uint8_t)val;
buf[1] = (uint8_t)(val >> 8);
buf[2] = (uint8_t)(val >> 16);
buf[3] = (uint8_t)(val >> 24);
}
#endif
#ifndef put_unaligned_be32
static inline void XZ_FUNC put_unaligned_be32(uint32_t val, uint8_t *buf)
{
buf[0] = (uint8_t)(val >> 24);
buf[1] = (uint8_t)(val >> 16);
buf[2] = (uint8_t)(val >> 8);
buf[3] = (uint8_t)val;
}
#endif
/*
* Use get_unaligned_le32() also for aligned access for simplicity. On
* little endian systems, #define get_le32(ptr) (*(const uint32_t *)(ptr))
* could save a few bytes in code size.
*/
#ifndef get_le32
# define get_le32 get_unaligned_le32
#endif
#endif

@ -0,0 +1,580 @@
/*
* Branch/Call/Jump (BCJ) filter decoders
*
* Authors: Lasse Collin <lasse.collin@tukaani.org>
* Igor Pavlov <http://7-zip.org/>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#include "xz_private.h"
/*
* The rest of the file is inside this ifdef. It makes things a little more
* convenient when building without support for any BCJ filters.
*/
#ifdef XZ_DEC_BCJ
struct xz_dec_bcj {
/* Type of the BCJ filter being used */
enum {
BCJ_X86 = 4, /* x86 or x86-64 */
BCJ_POWERPC = 5, /* Big endian only */
BCJ_IA64 = 6, /* Big or little endian */
BCJ_ARM = 7, /* Little endian only */
BCJ_ARMTHUMB = 8, /* Little endian only */
BCJ_SPARC = 9 /* Big or little endian */
} type;
/*
* Return value of the next filter in the chain. We need to preserve
* this information across calls, because we must not call the next
* filter anymore once it has returned XZ_STREAM_END.
*/
enum xz_ret ret;
/* True if we are operating in single-call mode. */
bool single_call;
/*
* Absolute position relative to the beginning of the uncompressed
* data (in a single .xz Block). We care only about the lowest 32
* bits so this doesn't need to be uint64_t even with big files.
*/
uint32_t pos;
/* x86 filter state */
uint32_t x86_prev_mask;
/* Temporary space to hold the variables from struct xz_buf */
uint8_t *out;
size_t out_pos;
size_t out_size;
struct {
/* Amount of already filtered data in the beginning of buf */
size_t filtered;
/* Total amount of data currently stored in buf */
size_t size;
/*
* Buffer to hold a mix of filtered and unfiltered data. This
* needs to be big enough to hold Alignment + 2 * Look-ahead:
*
* Type         Alignment   Look-ahead
* x86              1           4
* PowerPC          4           0
* IA-64           16           0
* ARM              4           0
* ARM-Thumb        2           2
* SPARC            4           0
*/
uint8_t buf[16];
} temp;
};
#ifdef XZ_DEC_X86
/*
* This is used to test the most significant byte of a memory address
* in an x86 instruction.
*/
static inline int bcj_x86_test_msbyte(uint8_t b)
{
return b == 0x00 || b == 0xFF;
}
static noinline_for_stack size_t XZ_FUNC bcj_x86(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
static const bool mask_to_allowed_status[8]
= { true, true, true, false, true, false, false, false };
static const uint8_t mask_to_bit_num[8] = { 0, 1, 2, 2, 3, 3, 3, 3 };
size_t i;
size_t prev_pos = (size_t)-1;
uint32_t prev_mask = s->x86_prev_mask;
uint32_t src;
uint32_t dest;
uint32_t j;
uint8_t b;
if (size <= 4)
return 0;
size -= 4;
for (i = 0; i < size; ++i) {
if ((buf[i] & 0xFE) != 0xE8)
continue;
prev_pos = i - prev_pos;
if (prev_pos > 3) {
prev_mask = 0;
} else {
prev_mask = (prev_mask << (prev_pos - 1)) & 7;
if (prev_mask != 0) {
b = buf[i + 4 - mask_to_bit_num[prev_mask]];
if (!mask_to_allowed_status[prev_mask]
|| bcj_x86_test_msbyte(b)) {
prev_pos = i;
prev_mask = (prev_mask << 1) | 1;
continue;
}
}
}
prev_pos = i;
if (bcj_x86_test_msbyte(buf[i + 4])) {
src = get_unaligned_le32(buf + i + 1);
while (true) {
dest = src - (s->pos + (uint32_t)i + 5);
if (prev_mask == 0)
break;
j = mask_to_bit_num[prev_mask] * 8;
b = (uint8_t)(dest >> (24 - j));
if (!bcj_x86_test_msbyte(b))
break;
src = dest ^ (((uint32_t)1 << (32 - j)) - 1);
}
dest &= 0x01FFFFFF;
dest |= (uint32_t)0 - (dest & 0x01000000);
put_unaligned_le32(dest, buf + i + 1);
i += 4;
} else {
prev_mask = (prev_mask << 1) | 1;
}
}
prev_pos = i - prev_pos;
s->x86_prev_mask = prev_pos > 3 ? 0 : prev_mask << (prev_pos - 1);
return i;
}
#endif
#ifdef XZ_DEC_POWERPC
static noinline_for_stack size_t XZ_FUNC bcj_powerpc(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
size_t i;
uint32_t instr;
for (i = 0; i + 4 <= size; i += 4) {
instr = get_unaligned_be32(buf + i);
if ((instr & 0xFC000003) == 0x48000001) {
instr &= 0x03FFFFFC;
instr -= s->pos + (uint32_t)i;
instr &= 0x03FFFFFC;
instr |= 0x48000001;
put_unaligned_be32(instr, buf + i);
}
}
return i;
}
#endif
#ifdef XZ_DEC_IA64
static noinline_for_stack size_t XZ_FUNC bcj_ia64(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
static const uint8_t branch_table[32] = {
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
4, 4, 6, 6, 0, 0, 7, 7,
4, 4, 0, 0, 4, 4, 0, 0
};
/*
* The local variables take a little bit of stack space, but it's less
* than what the LZMA2 decoder takes, so it doesn't make sense to reduce
* stack usage here without doing that for the LZMA2 decoder too.
*/
/* Loop counters */
size_t i;
size_t j;
/* Instruction slot (0, 1, or 2) in the 128-bit instruction word */
uint32_t slot;
/* Bitwise offset of the instruction indicated by slot */
uint32_t bit_pos;
/* bit_pos split into byte and bit parts */
uint32_t byte_pos;
uint32_t bit_res;
/* Address part of an instruction */
uint32_t addr;
/* Mask used to detect which instructions to convert */
uint32_t mask;
/* 41-bit instruction stored somewhere in the lowest 48 bits */
uint64_t instr;
/* Instruction normalized with bit_res for easier manipulation */
uint64_t norm;
for (i = 0; i + 16 <= size; i += 16) {
mask = branch_table[buf[i] & 0x1F];
for (slot = 0, bit_pos = 5; slot < 3; ++slot, bit_pos += 41) {
if (((mask >> slot) & 1) == 0)
continue;
byte_pos = bit_pos >> 3;
bit_res = bit_pos & 7;
instr = 0;
for (j = 0; j < 6; ++j)
instr |= (uint64_t)(buf[i + j + byte_pos])
<< (8 * j);
norm = instr >> bit_res;
if (((norm >> 37) & 0x0F) == 0x05
&& ((norm >> 9) & 0x07) == 0) {
addr = (norm >> 13) & 0x0FFFFF;
addr |= ((uint32_t)(norm >> 36) & 1) << 20;
addr <<= 4;
addr -= s->pos + (uint32_t)i;
addr >>= 4;
norm &= ~((uint64_t)0x8FFFFF << 13);
norm |= (uint64_t)(addr & 0x0FFFFF) << 13;
norm |= (uint64_t)(addr & 0x100000)
<< (36 - 20);
instr &= (1 << bit_res) - 1;
instr |= norm << bit_res;
for (j = 0; j < 6; j++)
buf[i + j + byte_pos]
= (uint8_t)(instr >> (8 * j));
}
}
}
return i;
}
#endif
#ifdef XZ_DEC_ARM
static noinline_for_stack size_t XZ_FUNC bcj_arm(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
size_t i;
uint32_t addr;
for (i = 0; i + 4 <= size; i += 4) {
if (buf[i + 3] == 0xEB) {
addr = (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8)
| ((uint32_t)buf[i + 2] << 16);
addr <<= 2;
addr -= s->pos + (uint32_t)i + 8;
addr >>= 2;
buf[i] = (uint8_t)addr;
buf[i + 1] = (uint8_t)(addr >> 8);
buf[i + 2] = (uint8_t)(addr >> 16);
}
}
return i;
}
#endif
#ifdef XZ_DEC_ARMTHUMB
static noinline_for_stack size_t XZ_FUNC bcj_armthumb(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
size_t i;
uint32_t addr;
for (i = 0; i + 4 <= size; i += 2) {
if ((buf[i + 1] & 0xF8) == 0xF0
&& (buf[i + 3] & 0xF8) == 0xF8) {
addr = (((uint32_t)buf[i + 1] & 0x07) << 19)
| ((uint32_t)buf[i] << 11)
| (((uint32_t)buf[i + 3] & 0x07) << 8)
| (uint32_t)buf[i + 2];
addr <<= 1;
addr -= s->pos + (uint32_t)i + 4;
addr >>= 1;
buf[i + 1] = (uint8_t)(0xF0 | ((addr >> 19) & 0x07));
buf[i] = (uint8_t)(addr >> 11);
buf[i + 3] = (uint8_t)(0xF8 | ((addr >> 8) & 0x07));
buf[i + 2] = (uint8_t)addr;
i += 2;
}
}
return i;
}
#endif
#ifdef XZ_DEC_SPARC
static noinline_for_stack size_t XZ_FUNC bcj_sparc(
struct xz_dec_bcj *s, uint8_t *buf, size_t size)
{
size_t i;
uint32_t instr;
for (i = 0; i + 4 <= size; i += 4) {
instr = get_unaligned_be32(buf + i);
if ((instr >> 22) == 0x100 || (instr >> 22) == 0x1FF) {
instr <<= 2;
instr -= s->pos + (uint32_t)i;
instr >>= 2;
instr = ((uint32_t)0x40000000 - (instr & 0x400000))
| 0x40000000 | (instr & 0x3FFFFF);
put_unaligned_be32(instr, buf + i);
}
}
return i;
}
#endif
/*
* Apply the selected BCJ filter. Update *pos and s->pos to match the amount
* of data that got filtered.
*
* NOTE: This is implemented as a switch statement to avoid using function
* pointers, which could be problematic in the kernel boot code, which must
* avoid pointers to static data (at least on x86).
*/
static void XZ_FUNC bcj_apply(struct xz_dec_bcj *s,
uint8_t *buf, size_t *pos, size_t size)
{
size_t filtered;
buf += *pos;
size -= *pos;
switch (s->type) {
#ifdef XZ_DEC_X86
case BCJ_X86:
filtered = bcj_x86(s, buf, size);
break;
#endif
#ifdef XZ_DEC_POWERPC
case BCJ_POWERPC:
filtered = bcj_powerpc(s, buf, size);
break;
#endif
#ifdef XZ_DEC_IA64
case BCJ_IA64:
filtered = bcj_ia64(s, buf, size);
break;
#endif
#ifdef XZ_DEC_ARM
case BCJ_ARM:
filtered = bcj_arm(s, buf, size);
break;
#endif
#ifdef XZ_DEC_ARMTHUMB
case BCJ_ARMTHUMB:
filtered = bcj_armthumb(s, buf, size);
break;
#endif
#ifdef XZ_DEC_SPARC
case BCJ_SPARC:
filtered = bcj_sparc(s, buf, size);
break;
#endif
default:
/* Never reached but silence compiler warnings. */
filtered = 0;
break;
}
*pos += filtered;
s->pos += filtered;
}
/*
* Flush pending filtered data from temp to the output buffer.
* Move the remaining mixture of possibly filtered and unfiltered
* data to the beginning of temp.
*/
static void XZ_FUNC bcj_flush(struct xz_dec_bcj *s, struct xz_buf *b)
{
size_t copy_size;
copy_size = min_t(size_t, s->temp.filtered, b->out_size - b->out_pos);
memcpy(b->out + b->out_pos, s->temp.buf, copy_size);
b->out_pos += copy_size;
s->temp.filtered -= copy_size;
s->temp.size -= copy_size;
memmove(s->temp.buf, s->temp.buf + copy_size, s->temp.size);
}
/*
* The BCJ filter functions are primitive in the sense that they process the
* data in chunks of 1-16 bytes. To hide this issue, this function does
* some buffering.
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_bcj_run(struct xz_dec_bcj *s,
struct xz_dec_lzma2 *lzma2, struct xz_buf *b)
{
size_t out_start;
/*
* Flush pending already filtered data to the output buffer. Return
* immediately if we couldn't flush everything, or if the next
* filter in the chain had already returned XZ_STREAM_END.
*/
if (s->temp.filtered > 0) {
bcj_flush(s, b);
if (s->temp.filtered > 0)
return XZ_OK;
if (s->ret == XZ_STREAM_END)
return XZ_STREAM_END;
}
/*
* If we have more output space than what is currently pending in
* temp, copy the unfiltered data from temp to the output buffer
* and try to fill the output buffer by decoding more data from the
* next filter in the chain. Apply the BCJ filter on the new data
* in the output buffer. If everything cannot be filtered, copy it
* to temp and rewind the output buffer position accordingly.
*
* This needs to be always run when temp.size == 0 to handle a special
* case where the output buffer is full and the next filter has no
* more output coming but hasn't returned XZ_STREAM_END yet.
*/
if (s->temp.size < b->out_size - b->out_pos || s->temp.size == 0) {
out_start = b->out_pos;
memcpy(b->out + b->out_pos, s->temp.buf, s->temp.size);
b->out_pos += s->temp.size;
s->ret = xz_dec_lzma2_run(lzma2, b);
if (s->ret != XZ_STREAM_END
&& (s->ret != XZ_OK || s->single_call))
return s->ret;
bcj_apply(s, b->out, &out_start, b->out_pos);
/*
* As an exception, if the next filter returned XZ_STREAM_END,
* we can do that too, since the last few bytes that remain
* unfiltered are meant to remain unfiltered.
*/
if (s->ret == XZ_STREAM_END)
return XZ_STREAM_END;
s->temp.size = b->out_pos - out_start;
b->out_pos -= s->temp.size;
memcpy(s->temp.buf, b->out + b->out_pos, s->temp.size);
/*
* If there wasn't enough input to the next filter to fill
* the output buffer with unfiltered data, there's no point
* in trying to decode more data to temp.
*/
if (b->out_pos + s->temp.size < b->out_size)
return XZ_OK;
}
/*
* We have unfiltered data in temp. If the output buffer isn't full
* yet, try to fill the temp buffer by decoding more data from the
* next filter. Apply the BCJ filter on temp. Then we hopefully can
* fill the actual output buffer by copying filtered data from temp.
* A mix of filtered and unfiltered data may be left in temp; it will
* be taken care of on the next call to this function.
*/
if (b->out_pos < b->out_size) {
/* Make b->out{,_pos,_size} temporarily point to s->temp. */
s->out = b->out;
s->out_pos = b->out_pos;
s->out_size = b->out_size;
b->out = s->temp.buf;
b->out_pos = s->temp.size;
b->out_size = sizeof(s->temp.buf);
s->ret = xz_dec_lzma2_run(lzma2, b);
s->temp.size = b->out_pos;
b->out = s->out;
b->out_pos = s->out_pos;
b->out_size = s->out_size;
if (s->ret != XZ_OK && s->ret != XZ_STREAM_END)
return s->ret;
bcj_apply(s, s->temp.buf, &s->temp.filtered, s->temp.size);
/*
* If the next filter returned XZ_STREAM_END, we mark that
* everything is filtered, since the last unfiltered bytes
* of the stream are meant to be left as is.
*/
if (s->ret == XZ_STREAM_END)
s->temp.filtered = s->temp.size;
bcj_flush(s, b);
if (s->temp.filtered > 0)
return XZ_OK;
}
return s->ret;
}
XZ_EXTERN struct xz_dec_bcj * XZ_FUNC xz_dec_bcj_create(bool single_call)
{
struct xz_dec_bcj *s = kmalloc(sizeof(*s), GFP_KERNEL);
if (s != NULL)
s->single_call = single_call;
return s;
}
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_bcj_reset(
struct xz_dec_bcj *s, uint8_t id)
{
switch (id) {
#ifdef XZ_DEC_X86
case BCJ_X86:
#endif
#ifdef XZ_DEC_POWERPC
case BCJ_POWERPC:
#endif
#ifdef XZ_DEC_IA64
case BCJ_IA64:
#endif
#ifdef XZ_DEC_ARM
case BCJ_ARM:
#endif
#ifdef XZ_DEC_ARMTHUMB
case BCJ_ARMTHUMB:
#endif
#ifdef XZ_DEC_SPARC
case BCJ_SPARC:
#endif
break;
default:
/* Unsupported Filter ID */
return XZ_OPTIONS_ERROR;
}
s->type = id;
s->ret = XZ_OK;
s->pos = 0;
s->x86_prev_mask = 0;
s->temp.filtered = 0;
s->temp.size = 0;
return XZ_OK;
}
#endif
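The per-architecture filters above all follow the same pattern: find a branch instruction, rebuild its address field, subtract the position of the instruction in the stream to get a relative address back, and store the result. The stand-alone sketch below mirrors the ARM case (compare bcj_arm()) so the arithmetic can be tried outside the decoder. The name demo_bcj_arm() and its encode flag are hypothetical and exist only for this illustration; they are not part of XZ Embedded, and the encoding direction is assumed to be the exact inverse of the decoder shown above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative copy of the ARM BL conversion (hypothetical helper, not part
 * of XZ Embedded). "pos" is the stream offset of buf[0]. With encode != 0
 * the relative branch target is made absolute (assumed encoder behaviour);
 * with encode == 0 it is made relative again, as bcj_arm() does.
 * Returns the number of bytes that were processed.
 */
static size_t demo_bcj_arm(uint8_t *buf, size_t size, uint32_t pos, int encode)
{
	size_t i;
	uint32_t addr;

	for (i = 0; i + 4 <= size; i += 4) {
		/* 0xEB in the top byte marks an unconditional BL instruction. */
		if (buf[i + 3] != 0xEB)
			continue;

		/* The 24-bit little-endian field counts 4-byte words. */
		addr = (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8)
				| ((uint32_t)buf[i + 2] << 16);
		addr <<= 2;

		/* ARM branches are relative to the instruction address + 8. */
		if (encode)
			addr += pos + (uint32_t)i + 8;
		else
			addr -= pos + (uint32_t)i + 8;

		addr >>= 2;
		buf[i] = (uint8_t)addr;
		buf[i + 1] = (uint8_t)(addr >> 8);
		buf[i + 2] = (uint8_t)(addr >> 16);
	}

	return i;
}
```

With pos = 0x100, a BL whose offset field is 0x000010 becomes 0x000052 after encoding and returns to 0x000010 after decoding. Trailing bytes that do not form a complete 4-byte instruction are left untouched, which is why a 6-byte buffer yields a return value of 4.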

File diff suppressed because it is too large


@ -0,0 +1,820 @@
/*
* .xz Stream decoder
*
* Author: Lasse Collin <lasse.collin@tukaani.org>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#include "xz_private.h"
#include "xz_stream.h"
/* Hash used to validate the Index field */
struct xz_dec_hash {
vli_type unpadded;
vli_type uncompressed;
uint32_t crc32;
};
struct xz_dec {
/* Position in dec_main() */
enum {
SEQ_STREAM_HEADER,
SEQ_BLOCK_START,
SEQ_BLOCK_HEADER,
SEQ_BLOCK_UNCOMPRESS,
SEQ_BLOCK_PADDING,
SEQ_BLOCK_CHECK,
SEQ_INDEX,
SEQ_INDEX_PADDING,
SEQ_INDEX_CRC32,
SEQ_STREAM_FOOTER
} sequence;
/* Position in variable-length integers and Check fields */
uint32_t pos;
/* Variable-length integer decoded by dec_vli() */
vli_type vli;
/* Saved in_pos and out_pos */
size_t in_start;
size_t out_start;
/* CRC32 value in Block or Index */
uint32_t crc32;
/* Type of the integrity check calculated from uncompressed data */
enum xz_check check_type;
/* Operation mode */
enum xz_mode mode;
/*
* True if the next call to xz_dec_run() is allowed to return
* XZ_BUF_ERROR.
*/
bool allow_buf_error;
/* Information stored in Block Header */
struct {
/*
* Value stored in the Compressed Size field, or
* VLI_UNKNOWN if Compressed Size is not present.
*/
vli_type compressed;
/*
* Value stored in the Uncompressed Size field, or
* VLI_UNKNOWN if Uncompressed Size is not present.
*/
vli_type uncompressed;
/* Size of the Block Header field */
uint32_t size;
} block_header;
/* Information collected when decoding Blocks */
struct {
/* Observed compressed size of the current Block */
vli_type compressed;
/* Observed uncompressed size of the current Block */
vli_type uncompressed;
/* Number of Blocks decoded so far */
vli_type count;
/*
* Hash calculated from the Block sizes. This is used to
* validate the Index field.
*/
struct xz_dec_hash hash;
} block;
/* Variables needed when verifying the Index field */
struct {
/* Position in dec_index() */
enum {
SEQ_INDEX_COUNT,
SEQ_INDEX_UNPADDED,
SEQ_INDEX_UNCOMPRESSED
} sequence;
/* Size of the Index in bytes */
vli_type size;
/* Number of Records (matches block.count in valid files) */
vli_type count;
/*
* Hash calculated from the Records (matches block.hash in
* valid files).
*/
struct xz_dec_hash hash;
} index;
/*
* Temporary buffer needed to hold Stream Header, Block Header,
* and Stream Footer. The Block Header is the biggest (1 KiB)
* so we reserve space according to that. buf[] has to be aligned
* to a multiple of four bytes; the size_t variables before it
* should guarantee this.
*/
struct {
size_t pos;
size_t size;
uint8_t buf[1024];
} temp;
struct xz_dec_lzma2 *lzma2;
#ifdef XZ_DEC_BCJ
struct xz_dec_bcj *bcj;
bool bcj_active;
#endif
};
#ifdef XZ_DEC_ANY_CHECK
/* Sizes of the Check field with different Check IDs */
static const uint8_t check_sizes[16] = {
0,
4, 4, 4,
8, 8, 8,
16, 16, 16,
32, 32, 32,
64, 64, 64
};
#endif
/*
* Fill s->temp by copying data starting from b->in[b->in_pos]. Caller
* must have set s->temp.size to indicate how much data we are supposed
* to copy into s->temp.buf, with s->temp.pos counting the bytes copied
* so far. Return true once s->temp.pos has reached
* s->temp.size.
*/
static bool XZ_FUNC fill_temp(struct xz_dec *s, struct xz_buf *b)
{
size_t copy_size = min_t(size_t,
b->in_size - b->in_pos, s->temp.size - s->temp.pos);
memcpy(s->temp.buf + s->temp.pos, b->in + b->in_pos, copy_size);
b->in_pos += copy_size;
s->temp.pos += copy_size;
if (s->temp.pos == s->temp.size) {
s->temp.pos = 0;
return true;
}
return false;
}
/* Decode a variable-length integer (little-endian base-128 encoding) */
static enum xz_ret XZ_FUNC dec_vli(struct xz_dec *s,
const uint8_t *in, size_t *in_pos, size_t in_size)
{
uint8_t byte;
if (s->pos == 0)
s->vli = 0;
while (*in_pos < in_size) {
byte = in[*in_pos];
++*in_pos;
s->vli |= (vli_type)(byte & 0x7F) << s->pos;
if ((byte & 0x80) == 0) {
/* Don't allow non-minimal encodings. */
if (byte == 0 && s->pos != 0)
return XZ_DATA_ERROR;
s->pos = 0;
return XZ_STREAM_END;
}
s->pos += 7;
if (s->pos == 7 * VLI_BYTES_MAX)
return XZ_DATA_ERROR;
}
return XZ_OK;
}
/*
* Decode the Compressed Data field from a Block. Update and validate
* the observed compressed and uncompressed sizes of the Block so that
* they don't exceed the values possibly stored in the Block Header
* (validation assumes that no integer overflow occurs, since vli_type
* is normally uint64_t). Update the CRC32 if presence of the CRC32
* field was indicated in Stream Header.
*
* Once the decoding is finished, validate that the observed sizes match
* the sizes possibly stored in the Block Header. Update the hash and
* Block count, which are later used to validate the Index field.
*/
static enum xz_ret XZ_FUNC dec_block(struct xz_dec *s, struct xz_buf *b)
{
enum xz_ret ret;
s->in_start = b->in_pos;
s->out_start = b->out_pos;
#ifdef XZ_DEC_BCJ
if (s->bcj_active)
ret = xz_dec_bcj_run(s->bcj, s->lzma2, b);
else
#endif
ret = xz_dec_lzma2_run(s->lzma2, b);
s->block.compressed += b->in_pos - s->in_start;
s->block.uncompressed += b->out_pos - s->out_start;
/*
* There is no need to separately check for VLI_UNKNOWN, since
* the observed sizes are always smaller than VLI_UNKNOWN.
*/
if (s->block.compressed > s->block_header.compressed
|| s->block.uncompressed
> s->block_header.uncompressed)
return XZ_DATA_ERROR;
if (s->check_type == XZ_CHECK_CRC32)
s->crc32 = xz_crc32(b->out + s->out_start,
b->out_pos - s->out_start, s->crc32);
if (ret == XZ_STREAM_END) {
if (s->block_header.compressed != VLI_UNKNOWN
&& s->block_header.compressed
!= s->block.compressed)
return XZ_DATA_ERROR;
if (s->block_header.uncompressed != VLI_UNKNOWN
&& s->block_header.uncompressed
!= s->block.uncompressed)
return XZ_DATA_ERROR;
s->block.hash.unpadded += s->block_header.size
+ s->block.compressed;
#ifdef XZ_DEC_ANY_CHECK
s->block.hash.unpadded += check_sizes[s->check_type];
#else
if (s->check_type == XZ_CHECK_CRC32)
s->block.hash.unpadded += 4;
#endif
s->block.hash.uncompressed += s->block.uncompressed;
s->block.hash.crc32 = xz_crc32(
(const uint8_t *)&s->block.hash,
sizeof(s->block.hash), s->block.hash.crc32);
++s->block.count;
}
return ret;
}
/* Update the Index size and the CRC32 value. */
static void XZ_FUNC index_update(struct xz_dec *s, const struct xz_buf *b)
{
size_t in_used = b->in_pos - s->in_start;
s->index.size += in_used;
s->crc32 = xz_crc32(b->in + s->in_start, in_used, s->crc32);
}
/*
* Decode the Number of Records, Unpadded Size, and Uncompressed Size
* fields from the Index field. That is, Index Padding and CRC32 are not
* decoded by this function.
*
* This can return XZ_OK (more input needed), XZ_STREAM_END (everything
* successfully decoded), or XZ_DATA_ERROR (input is corrupt).
*/
static enum xz_ret XZ_FUNC dec_index(struct xz_dec *s, struct xz_buf *b)
{
enum xz_ret ret;
do {
ret = dec_vli(s, b->in, &b->in_pos, b->in_size);
if (ret != XZ_STREAM_END) {
index_update(s, b);
return ret;
}
switch (s->index.sequence) {
case SEQ_INDEX_COUNT:
s->index.count = s->vli;
/*
* Validate that the Number of Records field
* indicates the same number of Records as
* there were Blocks in the Stream.
*/
if (s->index.count != s->block.count)
return XZ_DATA_ERROR;
s->index.sequence = SEQ_INDEX_UNPADDED;
break;
case SEQ_INDEX_UNPADDED:
s->index.hash.unpadded += s->vli;
s->index.sequence = SEQ_INDEX_UNCOMPRESSED;
break;
case SEQ_INDEX_UNCOMPRESSED:
s->index.hash.uncompressed += s->vli;
s->index.hash.crc32 = xz_crc32(
(const uint8_t *)&s->index.hash,
sizeof(s->index.hash),
s->index.hash.crc32);
--s->index.count;
s->index.sequence = SEQ_INDEX_UNPADDED;
break;
}
} while (s->index.count > 0);
return XZ_STREAM_END;
}
/*
* Validate that the next four input bytes match the value of s->crc32.
* s->pos must be zero when starting to validate the first byte.
*/
static enum xz_ret XZ_FUNC crc32_validate(struct xz_dec *s, struct xz_buf *b)
{
do {
if (b->in_pos == b->in_size)
return XZ_OK;
if (((s->crc32 >> s->pos) & 0xFF) != b->in[b->in_pos++])
return XZ_DATA_ERROR;
s->pos += 8;
} while (s->pos < 32);
s->crc32 = 0;
s->pos = 0;
return XZ_STREAM_END;
}
#ifdef XZ_DEC_ANY_CHECK
/*
* Skip over the Check field when the Check ID is not supported.
* Returns true once the whole Check field has been skipped over.
*/
static bool XZ_FUNC check_skip(struct xz_dec *s, struct xz_buf *b)
{
while (s->pos < check_sizes[s->check_type]) {
if (b->in_pos == b->in_size)
return false;
++b->in_pos;
++s->pos;
}
s->pos = 0;
return true;
}
#endif
/* Decode the Stream Header field (the first 12 bytes of the .xz Stream). */
static enum xz_ret XZ_FUNC dec_stream_header(struct xz_dec *s)
{
if (!memeq(s->temp.buf, HEADER_MAGIC, HEADER_MAGIC_SIZE))
return XZ_FORMAT_ERROR;
if (xz_crc32(s->temp.buf + HEADER_MAGIC_SIZE, 2, 0)
!= get_le32(s->temp.buf + HEADER_MAGIC_SIZE + 2))
return XZ_DATA_ERROR;
if (s->temp.buf[HEADER_MAGIC_SIZE] != 0)
return XZ_OPTIONS_ERROR;
/*
* Of integrity checks, we support only none (Check ID = 0) and
* CRC32 (Check ID = 1). However, if XZ_DEC_ANY_CHECK is defined,
* we will accept other check types too, but then the check won't
* be verified and a warning (XZ_UNSUPPORTED_CHECK) will be given.
*/
s->check_type = s->temp.buf[HEADER_MAGIC_SIZE + 1];
#ifdef XZ_DEC_ANY_CHECK
if (s->check_type > XZ_CHECK_MAX)
return XZ_OPTIONS_ERROR;
if (s->check_type > XZ_CHECK_CRC32)
return XZ_UNSUPPORTED_CHECK;
#else
if (s->check_type > XZ_CHECK_CRC32)
return XZ_OPTIONS_ERROR;
#endif
return XZ_OK;
}
/* Decode the Stream Footer field (the last 12 bytes of the .xz Stream) */
static enum xz_ret XZ_FUNC dec_stream_footer(struct xz_dec *s)
{
if (!memeq(s->temp.buf + 10, FOOTER_MAGIC, FOOTER_MAGIC_SIZE))
return XZ_DATA_ERROR;
if (xz_crc32(s->temp.buf + 4, 6, 0) != get_le32(s->temp.buf))
return XZ_DATA_ERROR;
/*
* Validate Backward Size. Note that we never added the size of the
* Index CRC32 field to s->index.size, thus we use s->index.size / 4
* instead of s->index.size / 4 - 1.
*/
if ((s->index.size >> 2) != get_le32(s->temp.buf + 4))
return XZ_DATA_ERROR;
if (s->temp.buf[8] != 0 || s->temp.buf[9] != s->check_type)
return XZ_DATA_ERROR;
/*
* Use XZ_STREAM_END instead of XZ_OK to be more convenient
* for the caller.
*/
return XZ_STREAM_END;
}
/* Decode the Block Header and initialize the filter chain. */
static enum xz_ret XZ_FUNC dec_block_header(struct xz_dec *s)
{
enum xz_ret ret;
/*
* Validate the CRC32. We know that the temp buffer is at least
* eight bytes so this is safe.
*/
s->temp.size -= 4;
if (xz_crc32(s->temp.buf, s->temp.size, 0)
!= get_le32(s->temp.buf + s->temp.size))
return XZ_DATA_ERROR;
s->temp.pos = 2;
/*
* Catch unsupported Block Flags. We support only one or two filters
* in the chain, so we catch that with the same test.
*/
#ifdef XZ_DEC_BCJ
if (s->temp.buf[1] & 0x3E)
#else
if (s->temp.buf[1] & 0x3F)
#endif
return XZ_OPTIONS_ERROR;
/* Compressed Size */
if (s->temp.buf[1] & 0x40) {
if (dec_vli(s, s->temp.buf, &s->temp.pos, s->temp.size)
!= XZ_STREAM_END)
return XZ_DATA_ERROR;
s->block_header.compressed = s->vli;
} else {
s->block_header.compressed = VLI_UNKNOWN;
}
/* Uncompressed Size */
if (s->temp.buf[1] & 0x80) {
if (dec_vli(s, s->temp.buf, &s->temp.pos, s->temp.size)
!= XZ_STREAM_END)
return XZ_DATA_ERROR;
s->block_header.uncompressed = s->vli;
} else {
s->block_header.uncompressed = VLI_UNKNOWN;
}
#ifdef XZ_DEC_BCJ
/* If there are two filters, the first one must be a BCJ filter. */
s->bcj_active = s->temp.buf[1] & 0x01;
if (s->bcj_active) {
if (s->temp.size - s->temp.pos < 2)
return XZ_OPTIONS_ERROR;
ret = xz_dec_bcj_reset(s->bcj, s->temp.buf[s->temp.pos++]);
if (ret != XZ_OK)
return ret;
/*
* We don't support custom start offset,
* so Size of Properties must be zero.
*/
if (s->temp.buf[s->temp.pos++] != 0x00)
return XZ_OPTIONS_ERROR;
}
#endif
/* Valid Filter Flags always take at least two bytes. */
if (s->temp.size - s->temp.pos < 2)
return XZ_DATA_ERROR;
/* Filter ID = LZMA2 */
if (s->temp.buf[s->temp.pos++] != 0x21)
return XZ_OPTIONS_ERROR;
/* Size of Properties = 1-byte Filter Properties */
if (s->temp.buf[s->temp.pos++] != 0x01)
return XZ_OPTIONS_ERROR;
/* Filter Properties contains LZMA2 dictionary size. */
if (s->temp.size - s->temp.pos < 1)
return XZ_DATA_ERROR;
ret = xz_dec_lzma2_reset(s->lzma2, s->temp.buf[s->temp.pos++]);
if (ret != XZ_OK)
return ret;
/* The rest must be Header Padding. */
while (s->temp.pos < s->temp.size)
if (s->temp.buf[s->temp.pos++] != 0x00)
return XZ_OPTIONS_ERROR;
s->temp.pos = 0;
s->block.compressed = 0;
s->block.uncompressed = 0;
return XZ_OK;
}
static NOINLINE enum xz_ret XZ_FUNC dec_main(struct xz_dec *s, struct xz_buf *b)
{
enum xz_ret ret;
/*
* Store the start position for the case when we are in the middle
* of the Index field.
*/
s->in_start = b->in_pos;
while (true) {
switch (s->sequence) {
case SEQ_STREAM_HEADER:
/*
* Stream Header is copied to s->temp, and then
* decoded from there. This way if the caller
* gives us only a little input at a time, we can
* still keep the Stream Header decoding code
* simple. A similar approach is used in many places
* in this file.
*/
if (!fill_temp(s, b))
return XZ_OK;
/*
* If dec_stream_header() returns
* XZ_UNSUPPORTED_CHECK, it is still possible
* to continue decoding if working in multi-call
* mode. Thus, update s->sequence before calling
* dec_stream_header().
*/
s->sequence = SEQ_BLOCK_START;
ret = dec_stream_header(s);
if (ret != XZ_OK)
return ret;
/* Fall through */
case SEQ_BLOCK_START:
/* We need one byte of input to continue. */
if (b->in_pos == b->in_size)
return XZ_OK;
/* See if this is the beginning of the Index field. */
if (b->in[b->in_pos] == 0) {
s->in_start = b->in_pos++;
s->sequence = SEQ_INDEX;
break;
}
/*
* Calculate the size of the Block Header and
* prepare to decode it.
*/
s->block_header.size
= ((uint32_t)b->in[b->in_pos] + 1) * 4;
s->temp.size = s->block_header.size;
s->temp.pos = 0;
s->sequence = SEQ_BLOCK_HEADER;
/* Fall through */
case SEQ_BLOCK_HEADER:
if (!fill_temp(s, b))
return XZ_OK;
ret = dec_block_header(s);
if (ret != XZ_OK)
return ret;
s->sequence = SEQ_BLOCK_UNCOMPRESS;
/* Fall through */
case SEQ_BLOCK_UNCOMPRESS:
ret = dec_block(s, b);
if (ret != XZ_STREAM_END)
return ret;
s->sequence = SEQ_BLOCK_PADDING;
/* Fall through */
case SEQ_BLOCK_PADDING:
/*
* Size of Compressed Data + Block Padding
* must be a multiple of four. We don't need
* s->block.compressed for anything else
* anymore, so we use it here to test the size
* of the Block Padding field.
*/
while (s->block.compressed & 3) {
if (b->in_pos == b->in_size)
return XZ_OK;
if (b->in[b->in_pos++] != 0)
return XZ_DATA_ERROR;
++s->block.compressed;
}
s->sequence = SEQ_BLOCK_CHECK;
/* Fall through */
case SEQ_BLOCK_CHECK:
if (s->check_type == XZ_CHECK_CRC32) {
ret = crc32_validate(s, b);
if (ret != XZ_STREAM_END)
return ret;
}
#ifdef XZ_DEC_ANY_CHECK
else if (!check_skip(s, b)) {
return XZ_OK;
}
#endif
s->sequence = SEQ_BLOCK_START;
break;
case SEQ_INDEX:
ret = dec_index(s, b);
if (ret != XZ_STREAM_END)
return ret;
s->sequence = SEQ_INDEX_PADDING;
/* Fall through */
case SEQ_INDEX_PADDING:
while ((s->index.size + (b->in_pos - s->in_start))
& 3) {
if (b->in_pos == b->in_size) {
index_update(s, b);
return XZ_OK;
}
if (b->in[b->in_pos++] != 0)
return XZ_DATA_ERROR;
}
/* Finish the CRC32 value and Index size. */
index_update(s, b);
/* Compare the hashes to validate the Index field. */
if (!memeq(&s->block.hash, &s->index.hash,
sizeof(s->block.hash)))
return XZ_DATA_ERROR;
s->sequence = SEQ_INDEX_CRC32;
/* Fall through */
case SEQ_INDEX_CRC32:
ret = crc32_validate(s, b);
if (ret != XZ_STREAM_END)
return ret;
s->temp.size = STREAM_HEADER_SIZE;
s->sequence = SEQ_STREAM_FOOTER;
/* Fall through */
case SEQ_STREAM_FOOTER:
if (!fill_temp(s, b))
return XZ_OK;
return dec_stream_footer(s);
}
}
/* Never reached */
}
/*
* xz_dec_run() is a wrapper for dec_main() to handle some special cases in
* multi-call and single-call decoding.
*
* In multi-call mode, we must return XZ_BUF_ERROR when it seems clear that we
* are not going to make any progress anymore. This is to prevent the caller
* from calling us infinitely when the input file is truncated or otherwise
* corrupt. Since the zlib-style API allows the caller to fill the input buffer
* only when the decoder doesn't produce any new output, we have to be careful
* to avoid returning XZ_BUF_ERROR too easily: XZ_BUF_ERROR is returned only
* after the second consecutive call to xz_dec_run() that makes no progress.
*
* In single-call mode, if we couldn't decode everything and no error
* occurred, either the input is truncated or the output buffer is too small.
* Since we know that the last input byte never produces any output, we know
* that if all the input was consumed and decoding wasn't finished, the file
* must be corrupt. Otherwise the output buffer has to be too small or the
* file is corrupt in a way that decoding it produces too big output.
*
* If single-call decoding fails, we reset b->in_pos and b->out_pos back to
* their original values. This is because with some filter chains there won't
* be any valid uncompressed data in the output buffer unless the decoding
* actually succeeds (that's the price to pay for using the output buffer as
* the workspace).
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_run(struct xz_dec *s, struct xz_buf *b)
{
size_t in_start;
size_t out_start;
enum xz_ret ret;
if (DEC_IS_SINGLE(s->mode))
xz_dec_reset(s);
in_start = b->in_pos;
out_start = b->out_pos;
ret = dec_main(s, b);
if (DEC_IS_SINGLE(s->mode)) {
if (ret == XZ_OK)
ret = b->in_pos == b->in_size
? XZ_DATA_ERROR : XZ_BUF_ERROR;
if (ret != XZ_STREAM_END) {
b->in_pos = in_start;
b->out_pos = out_start;
}
} else if (ret == XZ_OK && in_start == b->in_pos
&& out_start == b->out_pos) {
if (s->allow_buf_error)
ret = XZ_BUF_ERROR;
s->allow_buf_error = true;
} else {
s->allow_buf_error = false;
}
return ret;
}
XZ_EXTERN struct xz_dec * XZ_FUNC xz_dec_init(
enum xz_mode mode, uint32_t dict_max)
{
struct xz_dec *s = kmalloc(sizeof(*s), GFP_KERNEL);
if (s == NULL)
return NULL;
s->mode = mode;
#ifdef XZ_DEC_BCJ
s->bcj = xz_dec_bcj_create(DEC_IS_SINGLE(mode));
if (s->bcj == NULL)
goto error_bcj;
#endif
s->lzma2 = xz_dec_lzma2_create(mode, dict_max);
if (s->lzma2 == NULL)
goto error_lzma2;
xz_dec_reset(s);
return s;
error_lzma2:
#ifdef XZ_DEC_BCJ
xz_dec_bcj_end(s->bcj);
error_bcj:
#endif
kfree(s);
return NULL;
}
XZ_EXTERN void XZ_FUNC xz_dec_reset(struct xz_dec *s)
{
s->sequence = SEQ_STREAM_HEADER;
s->allow_buf_error = false;
s->pos = 0;
s->crc32 = 0;
memzero(&s->block, sizeof(s->block));
memzero(&s->index, sizeof(s->index));
s->temp.pos = 0;
s->temp.size = STREAM_HEADER_SIZE;
}
XZ_EXTERN void XZ_FUNC xz_dec_end(struct xz_dec *s)
{
if (s != NULL) {
xz_dec_lzma2_end(s->lzma2);
#ifdef XZ_DEC_BCJ
xz_dec_bcj_end(s->bcj);
#endif
kfree(s);
}
}
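The sizes in the Block Header and the Index are stored as variable-length integers, which dec_vli() above decodes incrementally across calls. The sketch below is a simplified one-shot version of the same little-endian base-128 scheme, meant only to make the encoding rules easier to follow. demo_vli_decode() and DEMO_VLI_BYTES_MAX are hypothetical names; the 9-byte limit is an assumption for a 64-bit vli_type, since VLI_BYTES_MAX is defined outside this excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for a 64-bit vli_type: 63 usable bits, 7 bits per byte. */
#define DEMO_VLI_BYTES_MAX 9

/*
 * One-shot variant of dec_vli() (hypothetical helper, not part of the
 * decoder). Returns the number of input bytes consumed, or 0 if the input
 * is truncated, too long, or uses a non-minimal encoding.
 */
static size_t demo_vli_decode(const uint8_t *in, size_t in_size, uint64_t *out)
{
	uint64_t vli = 0;
	uint32_t shift = 0;
	size_t i;

	for (i = 0; i < in_size; ++i) {
		uint8_t byte = in[i];

		/* The low seven bits carry data, least significant group first. */
		vli |= (uint64_t)(byte & 0x7F) << shift;

		/* A clear high bit marks the last byte of the integer. */
		if ((byte & 0x80) == 0) {
			/* A trailing zero byte would be a non-minimal encoding. */
			if (byte == 0 && shift != 0)
				return 0;

			*out = vli;
			return i + 1;
		}

		shift += 7;
		if (shift == 7 * DEMO_VLI_BYTES_MAX)
			return 0;
	}

	/* Ran out of input before seeing the final byte. */
	return 0;
}
```

For example, the two bytes 0xAC 0x02 decode to 300 (0x2C plus 2 << 7), while 0x80 0x00 is rejected because the same value, zero, fits in a single byte.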


@ -0,0 +1,204 @@
/*
* LZMA2 definitions
*
* Authors: Lasse Collin <lasse.collin@tukaani.org>
* Igor Pavlov <http://7-zip.org/>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#ifndef XZ_LZMA2_H
#define XZ_LZMA2_H
/* Range coder constants */
#define RC_SHIFT_BITS 8
#define RC_TOP_BITS 24
#define RC_TOP_VALUE (1 << RC_TOP_BITS)
#define RC_BIT_MODEL_TOTAL_BITS 11
#define RC_BIT_MODEL_TOTAL (1 << RC_BIT_MODEL_TOTAL_BITS)
#define RC_MOVE_BITS 5
/*
* Maximum number of position states. A position state is the lowest pb
* number of bits of the current uncompressed offset. In some places there
* are different sets of probabilities for different position states.
*/
#define POS_STATES_MAX (1 << 4)
/*
* This enum is used to track which LZMA symbols have occurred most recently
* and in which order. This information is used to predict the next symbol.
*
* Symbols:
* - Literal: One 8-bit byte
* - Match: Repeat a chunk of data at some distance
* - Long repeat: Multi-byte match at a recently seen distance
* - Short repeat: One-byte repeat at a recently seen distance
*
* The symbol names are in the form STATE_oldest_older_previous. REP means
* either short or long repeated match, and NONLIT means any non-literal.
*/
enum lzma_state {
STATE_LIT_LIT,
STATE_MATCH_LIT_LIT,
STATE_REP_LIT_LIT,
STATE_SHORTREP_LIT_LIT,
STATE_MATCH_LIT,
STATE_REP_LIT,
STATE_SHORTREP_LIT,
STATE_LIT_MATCH,
STATE_LIT_LONGREP,
STATE_LIT_SHORTREP,
STATE_NONLIT_MATCH,
STATE_NONLIT_REP
};
/* Total number of states */
#define STATES 12
/* The lowest 7 states indicate that the previous symbol was a literal. */
#define LIT_STATES 7
/* Indicate that the latest symbol was a literal. */
static inline void XZ_FUNC lzma_state_literal(enum lzma_state *state)
{
if (*state <= STATE_SHORTREP_LIT_LIT)
*state = STATE_LIT_LIT;
else if (*state <= STATE_LIT_SHORTREP)
*state -= 3;
else
*state -= 6;
}
/* Indicate that the latest symbol was a match. */
static inline void XZ_FUNC lzma_state_match(enum lzma_state *state)
{
*state = *state < LIT_STATES ? STATE_LIT_MATCH : STATE_NONLIT_MATCH;
}
/* Indicate that the latest symbol was a long repeated match. */
static inline void XZ_FUNC lzma_state_long_rep(enum lzma_state *state)
{
*state = *state < LIT_STATES ? STATE_LIT_LONGREP : STATE_NONLIT_REP;
}
/* Indicate that the latest symbol was a short repeated match. */
static inline void XZ_FUNC lzma_state_short_rep(enum lzma_state *state)
{
*state = *state < LIT_STATES ? STATE_LIT_SHORTREP : STATE_NONLIT_REP;
}
/* Test if the previous symbol was a literal. */
static inline bool XZ_FUNC lzma_state_is_literal(enum lzma_state state)
{
return state < LIT_STATES;
}
/* Each literal coder is divided in three sections:
* - 0x001-0x0FF: Without match byte
* - 0x101-0x1FF: With match byte; match bit is 0
* - 0x201-0x2FF: With match byte; match bit is 1
*
* Match byte is used when the previous LZMA symbol was something other than
* a literal (that is, it was some kind of match).
*/
#define LITERAL_CODER_SIZE 0x300
/* Maximum number of literal coders */
#define LITERAL_CODERS_MAX (1 << 4)
/* Minimum length of a match is two bytes. */
#define MATCH_LEN_MIN 2
/* Match length is encoded with 4, 5, or 10 bits.
*
* Length Bits
* 2-9 4 = Choice=0 + 3 bits
* 10-17 5 = Choice=1 + Choice2=0 + 3 bits
* 18-273 10 = Choice=1 + Choice2=1 + 8 bits
*/
#define LEN_LOW_BITS 3
#define LEN_LOW_SYMBOLS (1 << LEN_LOW_BITS)
#define LEN_MID_BITS 3
#define LEN_MID_SYMBOLS (1 << LEN_MID_BITS)
#define LEN_HIGH_BITS 8
#define LEN_HIGH_SYMBOLS (1 << LEN_HIGH_BITS)
#define LEN_SYMBOLS (LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS + LEN_HIGH_SYMBOLS)
/*
* Maximum length of a match is 273 which is a result of the encoding
* described above.
*/
#define MATCH_LEN_MAX (MATCH_LEN_MIN + LEN_SYMBOLS - 1)
/*
* Different sets of probabilities are used for match distances that have
* very short match length: Lengths of 2, 3, and 4 bytes have a separate
* set of probabilities for each length. The matches with longer length
* use a shared set of probabilities.
*/
#define DIST_STATES 4
/*
* Get the index of the appropriate probability array for decoding
* the distance slot.
*/
static inline uint32_t XZ_FUNC lzma_get_dist_state(uint32_t len)
{
return len < DIST_STATES + MATCH_LEN_MIN
? len - MATCH_LEN_MIN : DIST_STATES - 1;
}
/*
* The highest two bits of a 32-bit match distance are encoded using six bits.
* This six-bit value is called a distance slot. This way encoding a 32-bit
* value takes 6-36 bits, larger values taking more bits.
*/
#define DIST_SLOT_BITS 6
#define DIST_SLOTS (1 << DIST_SLOT_BITS)
/* Match distances up to 127 are fully encoded using probabilities. Since
* the highest two bits (distance slot) are always encoded using six bits,
* the distances 0-3 don't need any additional bits to encode, since the
* distance slot itself is the same as the actual distance. DIST_MODEL_START
* indicates the first distance slot where at least one additional bit is
* needed.
*/
#define DIST_MODEL_START 4
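/*
 * Illustration only: a sketch of how a distance slot could map to the
 * smallest distance it covers and to the number of bits that follow it.
 * Both helpers are hypothetical names; the arithmetic assumes the usual
 * LZMA layout where the slot carries the position of the highest set bit
 * plus the bit below it.
 */

```c
#include <stdint.h>

#define DIST_MODEL_START 4

/* Hypothetical helper: number of bits that follow the distance slot. */
static uint32_t dist_slot_extra_bits(uint32_t slot)
{
	return slot < DIST_MODEL_START ? 0 : (slot >> 1) - 1;
}

/* Hypothetical helper: smallest distance covered by the slot. */
static uint32_t dist_slot_base(uint32_t slot)
{
	if (slot < DIST_MODEL_START)
		return slot;	/* slots 0-3 are the distance itself */
	return (uint32_t)(2 + (slot & 1)) << dist_slot_extra_bits(slot);
}
```

/*
 * Under these assumptions slot 63 needs 30 extra bits, which together with
 * the six slot bits gives the 36-bit upper bound mentioned earlier.
 */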
/*
* Match distances greater than 127 are encoded in three pieces:
* - distance slot: the highest two bits
* - direct bits: 2-26 bits below the highest two bits
* - alignment bits: four lowest bits
*
* Direct bits don't use any probabilities.
*
* The distance slot value of 14 is for distances 128-191.
*/
#define DIST_MODEL_END 14
/* Distance slots that indicate a distance <= 127. */
#define FULL_DISTANCES_BITS (DIST_MODEL_END / 2)
#define FULL_DISTANCES (1 << FULL_DISTANCES_BITS)
/*
* For match distances greater than 127, only the highest two bits and the
* lowest four bits (alignment) are encoded using probabilities.
*/
#define ALIGN_BITS 4
#define ALIGN_SIZE (1 << ALIGN_BITS)
#define ALIGN_MASK (ALIGN_SIZE - 1)
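/*
 * Illustration only: a sketch of how the three pieces of a long distance
 * (slot, direct bits, alignment bits) could be put back together. Both
 * helpers are hypothetical; the caller is assumed to pass slot >= 14 and
 * values that fit in the respective bit counts.
 */

```c
#include <stdint.h>

#define DIST_MODEL_END 14
#define ALIGN_BITS 4

/* Hypothetical helper: number of direct bits for a slot (0 below slot 14). */
static uint32_t dist_direct_bits(uint32_t slot)
{
	if (slot < DIST_MODEL_END)
		return 0;
	return (slot >> 1) - 1 - ALIGN_BITS;
}

/* Hypothetical helper: rebuild a distance from its three pieces. */
static uint32_t dist_assemble(uint32_t slot, uint32_t direct, uint32_t align)
{
	uint32_t dist = 2 + (slot & 1);		/* highest two bits */

	dist = (dist << dist_direct_bits(slot)) | direct;
	return (dist << ALIGN_BITS) | align;	/* four lowest bits */
}
```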
/* Total number of all probability variables */
#define PROBS_TOTAL (1846 + LITERAL_CODERS_MAX * LITERAL_CODER_SIZE)
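/*
 * Illustration only: one way to account for the constant 1846. The
 * grouping and the names in the comments are my reading of how a decoder
 * of this kind lays out its probability arrays, not something stated in
 * this header. STATES = 12 and POS_STATES_MAX = 16 are assumed values for
 * constants defined outside the lines shown here.
 */

```c
enum {
	STATES = 12,		/* assumed */
	POS_STATES_MAX = 16,	/* assumed */
	DIST_STATES = 4,
	DIST_SLOTS = 64,
	FULL_DISTANCES = 128,
	DIST_MODEL_END = 14,
	ALIGN_SIZE = 16,
	LEN_LOW_SYMBOLS = 8,
	LEN_MID_SYMBOLS = 8,
	LEN_HIGH_SYMBOLS = 256,
	LITERAL_CODERS_MAX = 16,
	LITERAL_CODER_SIZE = 0x300
};

/* Two choice bits plus the low, mid and high length trees. */
static unsigned len_coder_probs(void)
{
	return 2 + POS_STATES_MAX * (LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS)
			+ LEN_HIGH_SYMBOLS;
}

/* Everything except the literal coders. */
static unsigned fixed_probs(void)
{
	return STATES * POS_STATES_MAX			/* is-match flags */
			+ 4 * STATES			/* four is-rep flag arrays */
			+ STATES * POS_STATES_MAX	/* long-rep0 flags */
			+ DIST_STATES * DIST_SLOTS	/* distance slots */
			+ (FULL_DISTANCES - DIST_MODEL_END) /* short distances */
			+ ALIGN_SIZE			/* alignment bits */
			+ 2 * len_coder_probs();	/* match and rep lengths */
}

static unsigned total_probs(void)
{
	return fixed_probs() + LITERAL_CODERS_MAX * LITERAL_CODER_SIZE;
}
```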
/*
* LZMA remembers the four most recent match distances. Reusing these
* distances tends to take less space than re-encoding the actual
* distance value.
*/
#define REPS 4
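/*
 * Illustration only: a sketch of a four-entry distance history kept in
 * most-recently-used order. struct rep_history, rep_push() and rep_reuse()
 * are hypothetical names; the real decoder may store the four values
 * differently.
 */

```c
#include <stdint.h>

#define REPS 4

struct rep_history {
	uint32_t rep[REPS];	/* rep[0] is the most recent distance */
};

/* A new distance pushes the oldest one out. */
static void rep_push(struct rep_history *h, uint32_t dist)
{
	for (int i = REPS - 1; i > 0; --i)
		h->rep[i] = h->rep[i - 1];
	h->rep[0] = dist;
}

/* Reusing entry idx moves it to the front and keeps the other three. */
static uint32_t rep_reuse(struct rep_history *h, unsigned idx)
{
	uint32_t dist = h->rep[idx];

	for (; idx > 0; --idx)
		h->rep[idx] = h->rep[idx - 1];
	h->rep[0] = dist;
	return dist;
}
```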
#endif


@ -0,0 +1,159 @@
/*
* Private includes and definitions
*
* Author: Lasse Collin <lasse.collin@tukaani.org>
*
* This file has been put into the public domain.
* You can do whatever you want with this file.
*/
#ifndef XZ_PRIVATE_H
#define XZ_PRIVATE_H
#ifdef __KERNEL__
/* XZ_PREBOOT may be defined only via decompress_unxz.c. */
#	ifndef XZ_PREBOOT
#		include <linux/slab.h>
#		include <linux/vmalloc.h>
#		include <linux/string.h>
#		define memeq(a, b, size) (memcmp(a, b, size) == 0)
#		define memzero(buf, size) memset(buf, 0, size)
#	endif
#	include <asm/byteorder.h>
#	include <asm/unaligned.h>
#	define get_le32(p) le32_to_cpup((const uint32_t *)(p))
/* XZ_IGNORE_KCONFIG may be defined only via decompress_unxz.c. */
#	ifndef XZ_IGNORE_KCONFIG
#		ifdef CONFIG_XZ_DEC_X86
#			define XZ_DEC_X86
#		endif
#		ifdef CONFIG_XZ_DEC_POWERPC
#			define XZ_DEC_POWERPC
#		endif
#		ifdef CONFIG_XZ_DEC_IA64
#			define XZ_DEC_IA64
#		endif
#		ifdef CONFIG_XZ_DEC_ARM
#			define XZ_DEC_ARM
#		endif
#		ifdef CONFIG_XZ_DEC_ARMTHUMB
#			define XZ_DEC_ARMTHUMB
#		endif
#		ifdef CONFIG_XZ_DEC_SPARC
#			define XZ_DEC_SPARC
#		endif
#	endif
#	include <linux/xz.h>
#else
/*
* For userspace builds, use a separate header to define the required
* macros and functions. This makes it easier to adapt the code into
* different environments and avoids clutter in the Linux kernel tree.
*/
# include "xz_config.h"
#endif
/* If no specific decoding mode is requested, enable support for all modes. */
#if !defined(XZ_DEC_SINGLE) && !defined(XZ_DEC_PREALLOC) \
&& !defined(XZ_DEC_DYNALLOC)
# define XZ_DEC_SINGLE
# define XZ_DEC_PREALLOC
# define XZ_DEC_DYNALLOC
#endif
/*
* The DEC_IS_foo(mode) macros are used in "if" statements. If only some
* of the supported modes are enabled, these macros will evaluate to true or
* false at compile time and thus allow the compiler to omit unneeded code.
*/
#ifdef XZ_DEC_SINGLE
# define DEC_IS_SINGLE(mode) ((mode) == XZ_SINGLE)
#else
# define DEC_IS_SINGLE(mode) (false)
#endif
#ifdef XZ_DEC_PREALLOC
# define DEC_IS_PREALLOC(mode) ((mode) == XZ_PREALLOC)
#else
# define DEC_IS_PREALLOC(mode) (false)
#endif
#ifdef XZ_DEC_DYNALLOC
# define DEC_IS_DYNALLOC(mode) ((mode) == XZ_DYNALLOC)
#else
# define DEC_IS_DYNALLOC(mode) (false)
#endif
#if !defined(XZ_DEC_SINGLE)
# define DEC_IS_MULTI(mode) (true)
#elif defined(XZ_DEC_PREALLOC) || defined(XZ_DEC_DYNALLOC)
# define DEC_IS_MULTI(mode) ((mode) != XZ_SINGLE)
#else
# define DEC_IS_MULTI(mode) (false)
#endif
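/*
 * Illustration only: a sketch of the compile-time effect described above,
 * for a build that is assumed to enable only XZ_DEC_SINGLE. The enum is a
 * stand-in for the one in xz.h so the sketch compiles on its own, and
 * needs_stream_buffer() is a hypothetical caller.
 */

```c
#include <stdbool.h>

/* Stand-in for enum xz_mode from xz.h. */
enum xz_mode { XZ_SINGLE, XZ_PREALLOC, XZ_DYNALLOC };

/* Assumption: only single-call mode is enabled in this build. */
#define XZ_DEC_SINGLE

#ifdef XZ_DEC_SINGLE
#	define DEC_IS_SINGLE(mode) ((mode) == XZ_SINGLE)
#else
#	define DEC_IS_SINGLE(mode) (false)
#endif

#if !defined(XZ_DEC_SINGLE)
#	define DEC_IS_MULTI(mode) (true)
#elif defined(XZ_DEC_PREALLOC) || defined(XZ_DEC_DYNALLOC)
#	define DEC_IS_MULTI(mode) ((mode) != XZ_SINGLE)
#else
#	define DEC_IS_MULTI(mode) (false)
#endif

/* Hypothetical caller: the multi-call branch is constant-false here. */
static int needs_stream_buffer(enum xz_mode mode)
{
	if (DEC_IS_MULTI(mode))
		return 1;	/* a compiler can drop this branch */
	return DEC_IS_SINGLE(mode) ? 0 : -1;
}
```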
/*
* If any of the BCJ filter decoders are wanted, define XZ_DEC_BCJ.
* XZ_DEC_BCJ is used to enable generic support for BCJ decoders.
*/
#ifndef XZ_DEC_BCJ
#	if defined(XZ_DEC_X86) || defined(XZ_DEC_POWERPC) \
			|| defined(XZ_DEC_IA64) || defined(XZ_DEC_ARM) \
			|| defined(XZ_DEC_ARMTHUMB) || defined(XZ_DEC_SPARC)
#		define XZ_DEC_BCJ
#	endif
#endif
/*
* Allocate memory for LZMA2 decoder. xz_dec_lzma2_reset() must be used
* before calling xz_dec_lzma2_run().
*/
XZ_EXTERN struct xz_dec_lzma2 * XZ_FUNC xz_dec_lzma2_create(
		enum xz_mode mode, uint32_t dict_max);
/*
* Decode the LZMA2 properties (one byte) and reset the decoder. Return
* XZ_OK on success, XZ_MEMLIMIT_ERROR if the preallocated dictionary is not
* big enough, and XZ_OPTIONS_ERROR if props indicates something that this
* decoder doesn't support.
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_lzma2_reset(
		struct xz_dec_lzma2 *s, uint8_t props);
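/*
 * Illustration only: a sketch of how the one-byte LZMA2 properties value
 * could map to a dictionary size. This header does not spell out the
 * encoding; the formula and the limit of 39 are assumptions based on the
 * LZMA2 format as I understand it. lzma2_dict_size() is a hypothetical
 * helper that returns 0 where a decoder would report XZ_OPTIONS_ERROR.
 */

```c
#include <stdint.h>

/* Hypothetical helper: dictionary size in bytes, or 0 if unsupported. */
static uint32_t lzma2_dict_size(uint8_t props)
{
	if (props > 39)
		return 0;	/* would need more than 3 GiB */

	/* Mantissa of 2 or 3, shifted by an exponent that starts at 11. */
	return (uint32_t)(2 + (props & 1)) << ((props >> 1) + 11);
}
```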
/* Decode raw LZMA2 stream from b->in to b->out. */
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_lzma2_run(
		struct xz_dec_lzma2 *s, struct xz_buf *b);
/* Free the memory allocated for the LZMA2 decoder. */
XZ_EXTERN void XZ_FUNC xz_dec_lzma2_end(struct xz_dec_lzma2 *s);
#ifdef XZ_DEC_BCJ
/*
* Allocate memory for BCJ decoders. xz_dec_bcj_reset() must be used before
* calling xz_dec_bcj_run().
*/
XZ_EXTERN struct xz_dec_bcj * XZ_FUNC xz_dec_bcj_create(bool single_call);
/*
* Decode the Filter ID of a BCJ filter. This implementation doesn't
* support custom start offsets, so no decoding of Filter Properties
* is needed. Returns XZ_OK if the given Filter ID is supported.
* Otherwise XZ_OPTIONS_ERROR is returned.
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_bcj_reset(
		struct xz_dec_bcj *s, uint8_t id);
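/*
 * Illustration only: a sketch of the Filter ID check described above.
 * The numeric IDs are the ones I believe the .xz file format assigns to
 * the BCJ filters; the enum and bcj_id_supported() are hypothetical names,
 * and a real build would accept only the filters it was compiled with.
 */

```c
#include <stdbool.h>
#include <stdint.h>

/* BCJ Filter IDs (assumed to match the .xz file format). */
enum bcj_filter_id {
	BCJ_ID_X86 = 4,
	BCJ_ID_POWERPC = 5,
	BCJ_ID_IA64 = 6,
	BCJ_ID_ARM = 7,
	BCJ_ID_ARMTHUMB = 8,
	BCJ_ID_SPARC = 9
};

/* Hypothetical helper: true where a decoder would return XZ_OK. */
static bool bcj_id_supported(uint8_t id)
{
	switch (id) {
	case BCJ_ID_X86:
	case BCJ_ID_POWERPC:
	case BCJ_ID_IA64:
	case BCJ_ID_ARM:
	case BCJ_ID_ARMTHUMB:
	case BCJ_ID_SPARC:
		return true;
	default:
		return false;	/* would be XZ_OPTIONS_ERROR */
	}
}
```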
/*
* Decode raw BCJ + LZMA2 stream. This must be used only if there actually is
* a BCJ filter in the chain. If the chain has only LZMA2, xz_dec_lzma2_run()
* must be called directly.
*/
XZ_EXTERN enum xz_ret XZ_FUNC xz_dec_bcj_run(struct xz_dec_bcj *s,
		struct xz_dec_lzma2 *lzma2, struct xz_buf *b);
/* Free the memory allocated for the BCJ filters. */
#define xz_dec_bcj_end(s) kfree(s)
#endif
#endif
