Tool Box Archives - WOVIST

This is the last day of "31 Days of Tips from an Archivist"! I have had so much fun sharing tips and advice about researching in archives and especially records preservation.

If you have enjoyed my blog posts during the month of October, be sure to subscribe to my blog by email so you can receive my posts once a week.

In this post I am going to give you a list of tools that you as a genealogist and home archivist should purchase and have on hand in your "Home Archivist Tool Box" so you are ready to preserve your genealogical records, photographs and artifacts.

First, you will need a tool box! Yep, I recommend an actual tool box to keep all your materials in so they are all in one place and don't get lost. I have a tool box just like this one that I use in the Houston County, TN Archive to hold all my archival tools so I can access them when I am working on records preservation:

Tool Box

The first items to put in your tool box are soft #2 pencils. In the archives, we never use ball point pens on documents or photographs. We always use soft #2 pencils to identify photographs and to source documents:

Soft #2 Pencils

If you find that you have a document or photograph that pencil will not write on, you can use a pen called an Identi Pen. It is preferred that pencil is used all the time but there are some cases where pencil will not show up and this Identi Pen can be used.

Identi Pen

Next, you should obtain some soft bristle brushes. These will be used to brush off dust, dirt and other specks of grime that could be inside of books, scrapbooks and on your documents. I recommend getting make-up brushes; they are fairly cheap and work very well:

Make-Up Brushes

Next, every Home Archivist Tool Kit should have a micro spatula! This tool is used for many jobs in an archives, like removing staples. I consider this an essential tool for the home archivist:

Micro Spatula

Gloves! The next items to put in your tool kit are gloves. You can get white cotton gloves or nitrile gloves. These need to be used for handling photographs because the dirt and oils on your hands can damage photographs. There is no need to use gloves when handling documents; nice clean hands will do the trick!

Cotton Gloves 

Another must-have for the tool kit is a specialized Dry Cleaning Sponge. This sponge can be used to clean dusty, dirty or soot-covered documents, and it may remove some stains and dirty spots. CAUTION: do not use it on pencil writing! This sponge will erase the pencil writing.

Soot and Dirt Cleaning Sponge

And lastly, Document Repair Tape. In the archives, we don't use "tape" on anything. But there is a particular type of repair tape that is acceptable if used sparingly. If you have small tears or rips in your documents, this repair tape is perfectly fine to use. Just be sure to place the tape on the back of the document where there is no writing. 

Document Repair Tape

If you want to see me talking about the "Home Archivist Tool Kit", you can watch my guest appearance on Dear Myrtle's Wacky Wednesday hangout. It is free to watch and there is much more information about each item covered in this blog post:


One of the biggest challenges of an OSINT investigation is preserving data once you’ve found it. We have access to more information than ever before, but so much of it can be easily lost if we don’t take steps to archive it. If you’ve ever bookmarked an important resource only to come back later and see that it’s no longer available, you’ll know how frustrating it can be. I wrote about this problem last year in this post on the Attrition of Information In OSINT, along with some suggestions about how to preserve internet material as well as how to recover data when it has been removed.

The Internet Archive is probably the most familiar tool for preserving web pages but it is not without its limitations. It can’t capture Facebook pages for instance, and even if you instruct it to begin archiving a site, it can easily fail if that site’s robots.txt prevents crawling. The use of JavaScript and embedded video content also makes scraping and archiving webpages more difficult. The preserved site you find on the Internet Archive is often missing much of the original content and features.

To counter this it is necessary to use several types of tool to preserve web content for your investigations rather than just relying on one. Hunchly is excellent for capturing web pages, but I still like to supplement it with YouTube-dl for grabbing video content. Recently I’ve also started using Archive Box to build offline archives of web content that I want to keep. It wasn’t designed with OSINT work in mind but it is perfectly suited to the task of preserving and archiving web pages in multiple formats, including JavaScript-based websites and PDF/PNG screenshots. Video and audio content can also be downloaded and preserved.

Archive Box can build full archives of the websites listed in your bookmarks, browser history, or from a list of custom URLs that you provide. In the rest of this post I’ll show you how you can set up and install Archive Box and start to archive your own pages.

Setting Up

Archive Box is written in Python and runs on Linux and Mac OS. It makes use of the native Linux/Mac programs like curl and wget to grab a lot of data so unlike many other Python tools it won’t run in Windows. If you want to use Archive Box in a Windows environment then you’ll need to install and run it with Docker as per these instructions here.

The latest version (0.4.21) of Archive Box is available via PyPI, so that’s what we’ll install in this guide. It requires Python 3.7 or higher to run. I use Linux or MacOS for most tools like this; on Windows, installing Archive Box directly is not officially supported, so Docker is the safer route.

To check your current version of Python 3, open the console and type:

$ python3 --version

If the version is less than 3.7, you’ll need to install a more up to date version of Python.
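The same check can be done from inside Python, which is handy in a setup script. This is only a sketch: the function name `meets_minimum` is made up for illustration, and the 3.7 threshold is simply the requirement stated in this post.

```python
import sys

# Minimum version quoted in this post for Archive Box 0.4.21.
MINIMUM = (3, 7)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if a (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if meets_minimum(current) else "too old, please upgrade"
    print(f"Python {current.major}.{current.minor}: {status}")
```

Comparing tuples rather than strings avoids the classic mistake of treating "3.10" as older than "3.7".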

Once you’ve installed Python 3.7 (or higher), you can install Archive Box directly from PyPI with the following command:

$ pip3 install archivebox

If you’re unfamiliar with Python and Pip, have a read of this post I wrote last year. If you’re using MacOS you can install Archive Box with Brew:

There’s also a Docker image available for Archive Box, which means you can also run it on Windows; you’ll just need to set up Docker first. These days I prefer to use Docker images for OSINT tools but that’s for a future blog post.

Next you need to create a directory where your archive will be stored and complete the Archive Box setup there:

$ mkdir myarchive && cd myarchive
$ archivebox init

Once installation has finished, you’ll be ready to start building your archive.

Basic Usage

All commands take the following format:

To archive a single webpage, use the following command:

$ archivebox add ''

It’s also possible to add recursion to your request, so not only do you archive the page you specify, but Archive Box will also follow every link on the page and archive that too. The greater the depth, the further it will follow the links. Recursion can be added with the following option:

$ archivebox add '' --depth=1

This will now archive the page and follow all the links within it to a depth of 1, and then archive all those pages too.
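If you drive Archive Box from a script, it helps to build the command as an argument list instead of a hand-quoted string. The sketch below only assembles the command shown above. The helper name `build_add_command` and the example URL are illustrative assumptions, and nothing is executed.

```python
import shlex


def build_add_command(url, depth=0):
    """Build the argument list for `archivebox add`, optionally with --depth."""
    if not url:
        raise ValueError("a URL is required")
    if depth < 0:
        raise ValueError("depth cannot be negative")
    command = ["archivebox", "add", url]
    if depth:
        # Same option as in the command above, e.g. --depth=1
        command.append(f"--depth={depth}")
    return command


if __name__ == "__main__":
    # https://example.com is a placeholder, not a URL from this post.
    cmd = build_add_command("https://example.com", depth=1)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to `subprocess.run` from inside your archive folder, which sidesteps shell quoting problems with URLs that contain `&` or `?`.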

Viewing The Archive

Here’s the start of my new archive:

To view your archive, open your browser and navigate to the index.html file in the archive folder you created. It’ll be something like /home/username/myarchive/index.html. The archive records the time you created it, the link that was saved, and the original URL. Clicking on “Files” will show just how powerful Archive Box is:

The front page of my website has been saved as an offline local archive (complete with all necessary JavaScript so the appearance is identical to the live version), as pure HTML/CSS, as a PDF, as a PNG screenshot, and you’ll also notice that Archive Box has even archived a copy on the WayBack Machine too. So now I have a full working archive of my site saved locally on my machine. This is a much better way to preserve a webpage than with simple screenshots, and even if the original site were to disappear (I hope not) I’d still have a full offline copy to work with.

Archiving Multiple Websites

An archive with one site in it isn’t much fun. Fortunately Archive Box also makes it easy to archive multiple sites at once, either from a list of URLs, or from your browser’s saved bookmarks. To archive multiple websites, create a text file like this, with one URL on each line.
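A URL list like this can also be generated from a script, for example when the links come out of a case file or spreadsheet. The following is a minimal sketch. The helper name `write_url_list`, the file name `urls.txt` and the example URLs are assumptions for illustration only.

```python
from pathlib import Path


def write_url_list(urls, path):
    """Write one URL per line, skipping blanks and duplicates (order kept)."""
    cleaned = [u.strip() for u in urls if u and u.strip()]
    unique = list(dict.fromkeys(cleaned))  # de-duplicate, preserve order
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


if __name__ == "__main__":
    saved = write_url_list(
        ["https://example.com", "https://example.org", "https://example.com"],
        "urls.txt",
    )
    print(f"Wrote {len(saved)} URLs to urls.txt")
```

Removing duplicates up front is a small convenience, since Archive Box only archives each unique URL once anyway.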

Then we enter the following command (assuming your URL list is in the same directory as your archive):

After a few minutes, all the listed websites have been added to my offline archive in the same range of formats as before:

The archive of the BBC Football page shows the advantage of saving in multiple formats. The site features a lot of custom video streams that can’t really be archived offline, so the local archive looks a little odd:

Despite this, the fact that PDF and PNG versions of the site are also created means we can still see what the site was like at the time it was archived. You’ll also notice a limitation of the Wayback Machine that I mentioned earlier. If a site doesn’t want to be crawled by the Wayback Machine, the only thing that will be preserved is a 301 response. Archiving in multiple formats means that the chance of material being lost is significantly reduced.

Video Content

Archive Box uses YouTube-dl so that it can archive video content too. Let’s say that you want to add this OSINTCurious Ten Minute Tip to your archive. You can run the following command:

The entire 10 Minute Tip will now be saved to your archive, including the video and audio files.

To access the archived video/audio, click on the link for the archived media on the right. You’ll see that the video, audio and thumbnail content have all been archived and preserved offline:

Archiving Your Bookmarks

Archive Box also allows you to create archives of websites saved in your bookmarks. Simply export a list of bookmarks from your browser (see instructions here for Chrome and here for Firefox) as an HTML file and point Archive Box at it:
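Before importing a large bookmarks export, it can be useful to preview which links it actually contains. Browser exports are HTML files in which each bookmark is an `<A HREF="...">` tag, so the standard library is enough for a quick look. This is a sketch for inspection only and not how Archive Box itself parses the file. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith(("http://", "https://")):
            self.urls.append(href)


def extract_bookmark_urls(html_text):
    """Return the web URLs found in an exported bookmarks HTML document."""
    parser = _LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.urls


if __name__ == "__main__":
    sample = '<DL><DT><A HREF="https://example.com/">Example</A></DL>'
    for url in extract_bookmark_urls(sample):
        print(url)
```

Filtering to `http://` and `https://` drops bookmarklets and `file:` entries that would not make sense to archive.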


Being able to capture and preserve web content is a core skill for OSINT investigators. There are several technical challenges that make this difficult but Archive Box is a very effective way of gathering and preserving the information that you need.

Archive Box is in active development and it continues to receive new features and updates, so some elements of this post might become outdated in time. Follow @ArchiveBoxApp on Twitter for the latest updates.



Input Formats

ArchiveBox supports many input formats for URLs, including Pocket & Pinboard exports, browser bookmarks, browser history, plain text, HTML, markdown, and more!

Click these links for instructions on how to prepare your links from these sources:

  • TXT, RSS, XML, JSON, CSV, SQL, HTML, Markdown, or any other text-based format…
  • Browser history or browser bookmarks (see instructions for: Chrome, Firefox, Safari, IE, Opera, and more…)
  • Pocket, Pinboard, Instapaper, Shaarli, Delicious, Reddit Saved, Wallabag, OneTab, and more…

See the Usage: CLI page for documentation and examples.

It also includes a built-in scheduled import feature and a browser bookmarklet, so you can pull in URLs from RSS feeds, websites, or the filesystem regularly or on demand.

Output Formats

Inside each Snapshot folder, ArchiveBox saves these different types of extractor outputs as plain files:

  • Index: HTML and JSON index files containing metadata and details
  • Title, Favicon, Headers: response headers, site favicon, and site title
  • SingleFile: HTML snapshot rendered with headless Chrome using SingleFile
  • Wget Clone: wget clone of the site with
  • Chrome Headless
    • PDF: Printed PDF of site using headless chrome
    • Screenshot: 1440x900 screenshot of site using headless chrome
    • DOM Dump: DOM Dump of the HTML after rendering using headless chrome
  • Article Text: Article text extraction using Readability & Mercury
  • Permalink: A link to the saved site on
  • Audio & Video: all audio/video files and playlists, including subtitles & metadata with youtube-dl
  • Source Code: clone of any repository found on GitHub, Bitbucket, or GitLab links
  • More coming soon! See the Roadmap…

It does everything out-of-the-box by default, but you can disable or tweak individual archive methods via environment variables / config.


ArchiveBox can be configured via environment variables, by using the CLI, or by editing the config file directly.

These methods also work the same way when run inside Docker, see the Docker Configuration wiki page for details.

The config loading logic with all the options defined is here: .

Most options are also documented on the Configuration Wiki page.

Most Common Options to Tweak


For better security, easier updating, and to avoid polluting your host system with extra dependencies, it is strongly recommended to use the official Docker image with everything pre-installed for the best experience.

To achieve high fidelity archives in as many situations as possible, ArchiveBox depends on a variety of 3rd-party tools and libraries that specialize in extracting different types of content. These optional dependencies used for archiving sites include:

  • / (for screenshots, PDF, DOM HTML, and headless JS scripts)
  • & (for readability, mercury, and singlefile)
  • (for plain HTML, static files, and WARC saving)
  • (for fetching headers, favicon, and posting to
  • (for audio, video, and subtitles)
  • (for cloning git repos)
  • and more as we grow…

You don’t need to install every dependency to use ArchiveBox. ArchiveBox will automatically disable extractors that rely on dependencies that aren’t installed, based on what is configured and available on your system.
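To see ahead of time which external tools are present, you can check for them yourself. The sketch below is not ArchiveBox's own detection logic. It simply looks up a list of program names with `shutil.which`, and the helper name and the default tool list (taken from tools named elsewhere on this page) are assumptions.

```python
import shutil

# Tools mentioned on this page. Adjust the list to the extractors you care about.
DEFAULT_TOOLS = ["curl", "wget", "git", "youtube-dl"]


def split_by_availability(tools=DEFAULT_TOOLS, which=shutil.which):
    """Return (found, missing) lists of program names.

    `which` is injectable so the lookup can be replaced in tests.
    """
    found, missing = [], []
    for name in tools:
        (found if which(name) else missing).append(name)
    return found, missing


if __name__ == "__main__":
    found, missing = split_by_availability()
    print("found:  ", ", ".join(found) or "none")
    print("missing:", ", ".join(missing) or "none")
```

A missing tool is not an error here. It only means the matching extractor would be skipped.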

If not using Docker, make sure to keep the dependencies up-to-date yourself and check that ArchiveBox isn’t reporting any incompatibility with the versions you install.

Installing directly on Windows without Docker or WSL/WSL2/Cygwin is not officially supported, but some advanced users have reported getting it working.

Archive Layout

All of ArchiveBox’s state (including the index, snapshot data, and config file) is stored in a single folder called the “ArchiveBox data folder”. All CLI commands must be run from inside this folder, and you first create it by running archivebox init.

The on-disk layout is optimized to be easy to browse by hand and durable long-term. The main index is a standard database in the root of the data folder (it can also be exported as static JSON/HTML), and the archive snapshots are organized by date-added timestamp in a subfolder.

Each snapshot subfolder includes static JSON and HTML index files describing its contents, and the snapshot extractor outputs are plain files within the folder.
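Because the layout is plain folders and files, it can be inspected without ArchiveBox at all. The sketch below lists snapshot folders in timestamp order. The subfolder name `archive` is an assumption (the actual name is missing from this page), as is the helper name, so adjust both to match your own data folder.

```python
from pathlib import Path


def list_snapshot_dirs(data_dir, subfolder="archive"):
    """Return snapshot folder names under data_dir/subfolder, sorted by name.

    `subfolder` is an assumed name. Names are sorted as strings, which gives
    chronological order only while the timestamps have the same number of digits.
    """
    root = Path(data_dir) / subfolder
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    for name in list_snapshot_dirs("."):
        print(name)
```

Run it from inside the data folder to get a quick inventory of what has been captured.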

Static Archive Exporting

You can export the main index to browse it statically without needing to run a server.

Note about large exports: These exports are not paginated, so exporting many URLs or the entire archive at once may be slow. Use the filtering CLI flags on the command to export specific Snapshots or ranges.

The paths in the static exports are relative, so make sure to keep them next to your folder when backing them up or viewing them.



Archiving Private Content

If you’re importing pages with private content or URLs containing secret tokens you don’t want public (e.g. Google Docs, paywalled content, unlisted videos, etc.), you may want to disable some of the extractor methods to avoid leaking that content to 3rd party APIs or the public.

Security Risks of Viewing Archived JS

Be aware that malicious archived JS can access the contents of other pages in your archive when viewed. Because the Web UI serves all viewed snapshots from a single domain, they share a request context and typical CSRF/CORS/XSS/CSP protections do not work to prevent cross-site request attacks. See the Security Overview page and Issue #239 for more details.

The admin UI is also served from the same origin as replayed JS, so malicious pages could also potentially use your ArchiveBox login cookies to perform admin actions (e.g. adding/removing links, running extractors, etc.). We are planning to fix this security shortcoming in a future version by using separate ports/origins to serve the Admin UI and archived content (see Issue #239).

Note: Only the wget extractor method executes archived JS when viewing snapshots; all other archive methods produce static output that does not execute JS on viewing. If you are worried about these issues, you should disable the wget extractor method.

Saving Multiple Snapshots of a Single URL

First-class support for saving multiple snapshots of each site over time will be added eventually (along with the ability to view diffs of the changes between runs). For now ArchiveBox is designed to only archive each unique URL with each extractor type once. The workaround to take multiple snapshots of the same URL is to make them slightly different by adding a hash:

The Re-Snapshot button in the Admin UI is a shortcut for this hash-date workaround.
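The hash-date workaround is easy to script. The sketch below appends the current date as a URL fragment so that the same page counts as a new unique URL. The exact fragment format ArchiveBox uses is not shown on this page, so the ISO date used here, like the helper name, is an assumption.

```python
from datetime import date
from urllib.parse import urldefrag


def resnapshot_url(url, when=None):
    """Return `url` with a #YYYY-MM-DD fragment, replacing any existing fragment."""
    when = when or date.today()
    base, _old_fragment = urldefrag(url)
    return f"{base}#{when.isoformat()}"


if __name__ == "__main__":
    # https://example.com/page is a placeholder URL.
    print(resnapshot_url("https://example.com/page"))
```

The fragment is never sent to the server, so the page that gets fetched is the same. Only the URL string that ArchiveBox sees is different. Note that any fragment already on the URL is dropped, which matters for pages that use fragments for navigation.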

Storage Requirements

Because ArchiveBox is designed to ingest a firehose of browser history and bookmark feeds to a local disk, it can be much more disk-space intensive than a centralized service like the Internet Archive. ArchiveBox can use anywhere from ~1GB per 1000 articles to ~50GB per 1000 articles, mostly dependent on whether you’re saving audio & video.

Disk usage can be reduced by using a compressed/deduplicated filesystem like ZFS/BTRFS, or by turning off extractor methods you don’t need. Don’t store large collections on older filesystems like EXT3/FAT as they may not be able to handle more than 50k directory entries in the snapshot folder. Try to keep the main index database on a local drive (not a network mount) or SSD for maximum performance; the snapshot folder can be on a network mount or spinning HDD.
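The per-1000-article figures above translate into a rough planning range. The sketch below just scales those two numbers linearly, which is a simplifying assumption because real usage depends heavily on the media content of the pages. The function name is illustrative.

```python
def estimate_storage_gb(article_count, low_per_1000=1.0, high_per_1000=50.0):
    """Return a (low, high) disk estimate in GB from the figures quoted above."""
    if article_count < 0:
        raise ValueError("article_count cannot be negative")
    scale = article_count / 1000
    return (low_per_1000 * scale, high_per_1000 * scale)


if __name__ == "__main__":
    low, high = estimate_storage_gb(5000)
    print(f"5000 articles: roughly {low:.0f} GB to {high:.0f} GB")
```

For a bookmark collection of a few thousand links, the wide gap between the two ends is the main reason to decide early whether audio and video should be saved.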



The aim of ArchiveBox is to enable more of the internet to be archived by empowering people to self-host their own archives. The intent is for all the web content you care about to be viewable with common software in 50 - 100 years without needing to run ArchiveBox or other specialized software to replay it.

Vast treasure troves of knowledge are lost every day on the internet to link rot. As a society, we have an imperative to preserve some important parts of that treasure, just like we preserve our books, paintings, and music in physical libraries long after the originals go out of print or fade into obscurity.

Whether it’s to resist censorship by saving articles before they get taken down or edited, or just to save a collection of early 2010s flash games you love to play, having the tools to archive internet content enables you to save the stuff you care most about before it disappears.

The balance between the permanence and ephemeral nature of content on the internet is part of what makes it beautiful. I don’t think everything should be preserved in an automated fashion–making all content permanent and never removable, but I do think people should be able to decide for themselves and effectively archive specific content that they care about.

Because modern websites are complicated and often rely on dynamic content, ArchiveBox archives the sites in several different formats beyond what public archiving services save. Using multiple methods and the market-dominant browser to execute JS ensures we can save even the most complex, finicky websites in at least a few high-quality, long-term data formats.

Comparison to Other Projects


Check out our community page for an index of web archiving initiatives and projects.

A variety of open and closed-source archiving projects exist, but few provide a nice UI and CLI to manage a large, high-fidelity archive collection over time.

ArchiveBox tries to be a robust, set-and-forget archiving solution suitable for archiving RSS feeds, bookmarks, or your entire browsing history (beware, it may be too big to store), (this is not recommended due to JS replay security concerns).

Comparison With Centralized Public Archives

Not all content is suitable to be archived in a centralized collection, whether because it’s private, copyrighted, too large, or too complex. ArchiveBox hopes to fill that gap.

By having each user store their own content locally, we can save much larger portions of everyone’s browsing history than a shared centralized service would be able to handle. The eventual goal is to work towards federated archiving where users can share portions of their collections with each other.

Comparison With Other Self-Hosted Archiving Options

ArchiveBox differentiates itself from similar self-hosted projects by providing a comprehensive CLI for managing your archive, a Web UI that can be used either independently or together with the CLI, and a simple on-disk data format that can be used without either.

ArchiveBox is neither the highest fidelity, nor the simplest tool available for self-hosted archiving, rather it’s a jack-of-all-trades that tries to do most things well by default. It can be as simple or advanced as you want, and is designed to do everything out-of-the-box but be tuned to suit your needs.

If you want better fidelity for very complex interactive pages with heavy JS/streams/API requests, check out and

If you want more bookmark categorization and note-taking features, check out Archivy, Memex, Polar, or LinkAce.

If you need more advanced recursive spider/crawling ability beyond what ArchiveBox provides, check out Browsertrix, Photon, or Scrapy and pipe the outputted URLs into ArchiveBox.

For more alternatives, see our list here…


Internet Archiving Ecosystem

Whether you want to learn which organizations are the big players in the web archiving space, want to find a specific open-source tool for your web archiving need, or just want to see where archivists hang out online, our Community Wiki page serves as an index of the broader web archiving community. Check it out to learn about some of the coolest web archiving projects and communities on the web!

Need help building a custom archiving solution?

Hire the team that helps build ArchiveBox to work on your project. (@MonadicalSAS)

(They also do general software consulting across many industries)


We use the GitHub wiki system and Read the Docs (WIP) for documentation.

You can also access the docs locally by looking in the folder.

Getting Started


More Info


All contributions to ArchiveBox are welcomed! Check our issues and Roadmap for things to work on, and please open an issue to discuss your proposed implementation before working on things! Otherwise we may have to close your PR if it doesn’t align with our roadmap.

Low hanging fruit / easy first tickets:

Setup the dev environment

Common development tasks

See the folder and read the source of the bash scripts within. You can also run all these in Docker. For more examples see the GitHub Actions CI/CD tests that are run: .

Run in DEBUG mode

Install and run a specific GitHub branch

Run the linters

Run the integration tests

Make migrations or enter a django shell


Your Outlook mailbox is only so big, and it’s a good bet that you won’t stop getting email anytime soon. To keep it from filling up, you can move old items you want to keep to an archive, a separate Outlook Data File (.pst) that you can open from Outlook any time you need it.

Note: The Archive command and feature doesn’t appear for any account in your Outlook profile if you include an Exchange Server account and your organization uses Microsoft Exchange Server Online Archive. Your network administrator can also disable this feature.

By default, Outlook uses AutoArchive to archive items at a regular interval. To learn more, see Archive older items automatically.

You can also archive items manually whenever you want. That way, you can control which items to archive, where to store them, and how old an item needs to be before it can be archived.

  1. Do one of the following:

    • In Outlook 2013: Click File > Info > Cleanup Tools > Archive.


    • In Outlook 2016: Click File > Info > Tools > Clean up old items


    Tip: Archive and AutoArchive might not be available if your mail profile connects to an Exchange Server. It's also possible that your organization has a mail retention policy that overrides AutoArchive. Check with your system administrator for more information.

  2. Click the Archive this folder and all subfolders option, and choose the folder you want to archive.

  3. Under Archive items older than, enter a date.

    Archive dialog box

  4. You can create multiple .pst files if you want to archive some folders using different settings. For example, you may want to keep items in your Sent folder longer than items in your Inbox folder.

  5. Check the Include items with “Do not AutoArchive” checked box to archive individual items that are excluded from automatic archiving. This option doesn't remove that exclusion from these items, but instead ignores the Do not AutoArchive setting for this archive only.

  6. Click OK.

Turn off AutoArchive

To archive only when you want, turn off AutoArchive.

  1. Click File > Options > Advanced.

  2. Under AutoArchive, click AutoArchive Settings.

  3. Uncheck the Run AutoArchive every n days box.

Important: Office 2010 is no longer supported. Upgrade to Microsoft 365 to work anywhere from any device and continue to receive support.


By default, older Outlook items are archived automatically at a regular interval. To learn more about AutoArchive, see Use AutoArchive to back up or delete items.

You can also manually back up and archive items, in addition to AutoArchive or as a replacement. Manual archiving provides flexibility, and allows you to specify exactly which folders are included in the archive, and which archive Outlook Data File (.pst) is used.

To manually archive Outlook items, do the following:

  1. Click the File tab

  2. Click Cleanup Tools.

  3. Click Archive.

  4. Click the Archive this folder and all subfolders option, and then click the folder that you want to archive. Any subfolder of the folder you select is included in this manual archive.

  5. Under Archive items older than, enter a date.

    Archive dialog box

  6. If you do not want to use the default file or location, under Archive file, click Browse to specify a new file or location. Browse to find the file that you want, or enter the file name, then click OK. The destination file location appears in the Archive file box.

  7. Select the Include items with “Do not AutoArchive” checked check box to include any items that might be individually marked to be excluded from automatic archiving. This option does not remove that exclusion from these items, but instead ignores the Do not AutoArchive check box for this archive only.

Turn off AutoArchive

If you want to archive only manually, you must turn off AutoArchive. Do the following:

  1. Click the File tab.

  2. Click Options.

  3. On the Advanced tab, under AutoArchive, click AutoArchive Settings.

  4. Clear the Run AutoArchive every n days check box.

Important: Office 2007 is no longer supported. Upgrade to Microsoft 365 to work anywhere from any device and continue to receive support.


AutoArchive, which is turned on by default, automatically moves old items to an archive location at scheduled intervals. However, you can manually back up and archive items to a location that you specify.

Note: The Microsoft Office Outlook 2007 AutoArchive settings are customizable. Rather than backing up or archiving your items manually, you might find that AutoArchive can meet your needs. For more information, see Using AutoArchive to back up or delete items.

  1. On the File menu, click Archive.

  2. Select the Archive this folder and all subfolders option, and then specify a date under Archive items older than.

    Archived folders in the folder list

  3. Under Archive file, click Browse to specify a new file or location if you do not want to use the default file or location.

  4. Select the Include items with "Do not AutoArchive" checked check box if you want to override a previous setting to not automatically archive specific items. If you choose to manually archive these items during this procedure, the items will again be subject to the Do not AutoArchive setting unless you manually override that setting again in the future.

  5. Click OK.

Note: Outlook automatically creates another archive file for items in the folder and location specified.



🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more.

ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view sites you want to preserve offline.

You can set it up as a command-line tool, web app, and desktop app (alpha), on Linux, macOS, and Windows.

You can feed it URLs one at a time, or schedule regular imports from browser bookmarks or history, feeds like RSS, bookmark services like Pocket/Pinboard, and more. See input formats for a full list.

It saves snapshots of the URLs you feed it in several formats: HTML, PDF, PNG screenshots, WARC, and more out-of-the-box, with a wide variety of content extracted and preserved automatically (article text, audio/video, git repos, etc.). See output formats for a full list.

The goal is to sleep soundly knowing the part of the internet you care about will be automatically preserved in durable, easily accessible formats for decades after it goes down.

📦  Get ArchiveBox with Docker or one of the other install methods (see Quickstart below).

🔢 Example usage: adding links to archive.

🔢 Example usage: viewing the archived content.

Key Features

  • Free & open source, doesn’t require signing up for anything, stores all data locally
  • Powerful, intuitive command line interface with modular optional dependencies
  • Comprehensive documentation, active development, and rich community
  • Extracts a wide variety of content out-of-the-box: media (youtube-dl), articles (readability), code (git), etc.
  • Supports scheduled/realtime importing from many types of sources
  • Uses standard, durable, long-term formats like HTML, JSON, PDF, PNG, and WARC
  • Usable as a oneshot CLI, self-hosted web UI, Python API (BETA), REST API (ALPHA), or desktop app (ALPHA)
  • Saves pages to the Wayback Machine as well by default for redundancy (can be disabled for local-only mode)
  • Planned: support for archiving content requiring a login/paywall/cookies (working, but ill-advised until some pending fixes are released)
  • Planned: support for running JS during archiving to adblock, autoscroll, modal-hide, thread-expand…


🖥  Supported OSs: Linux/BSD, macOS, Windows (Docker/WSL)   👾  CPUs: amd64, x86, arm8, arm7 (raspi>=3)

✳️  Easy Setup

🛠  Package Manager Setup

🎗  Other Options

➡️  Next Steps


⚡️  CLI Usage

  • to administer your collection
  • to manage Snapshots in the archive
  • to pull in fresh URLs regularly from bookmarks/history/Pocket/Pinboard/RSS/etc.

🖥  Web UI Usage

🗄  SQL/Python/Filesystem Usage

