
github.com/gabriel-vasile/mimetype @v1.4.13 sqlite

405 symbols · 1,000 edges · 38 files · 282 documented (70%)
README

mimetype

A package for detecting MIME types and extensions based on magic numbers

Goroutine safe, extensible, no C bindings


Features

Install

go get github.com/gabriel-vasile/mimetype

Usage

mtype := mimetype.Detect(data) // data is a []byte
// OR
mtype, err := mimetype.DetectReader(r) // r is an io.Reader
// OR
mtype, err := mimetype.DetectFile("/path/to/file")
fmt.Println(mtype.String(), mtype.Extension())

See the runnable Go Playground examples.

Caution: only use libraries like mimetype as a last resort. Content type detection using magic numbers is slow, inaccurate, and non-standard. Most of the time, protocols have methods for specifying such metadata, e.g., the Content-Type header in HTTP and SMTP.
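The advice above can be sketched as a small helper that trusts a declared Content-Type when one is present and only sniffs the bytes as a fallback. This is an illustration, not part of mimetype: contentType is a hypothetical helper, and the standard library's http.DetectContentType stands in for a magic-number detector.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// contentType (hypothetical helper) returns the media type declared in the
// header value when it parses cleanly, and sniffs the body otherwise.
// http.DetectContentType is the standard library sniffer, used here only as
// a stand-in for a magic-number detector.
func contentType(declared string, body []byte) string {
	if declared != "" {
		if mediaType, _, err := mime.ParseMediaType(declared); err == nil {
			return mediaType
		}
	}
	mediaType, _, err := mime.ParseMediaType(http.DetectContentType(body))
	if err != nil {
		return "application/octet-stream"
	}
	return mediaType
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n")
	// The declared value wins even though the bytes look like a PNG.
	fmt.Println(contentType("application/json; charset=utf-8", png))
	// With no declared value, the type is sniffed from the bytes.
	fmt.Println(contentType("", png))
}
```

In an HTTP handler the declared value would typically come from r.Header.Get("Content-Type").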

FAQ

Q: My file is in the list of supported MIME types but it is not correctly detected. What should I do?

A: Some file formats (often Microsoft Office documents) keep their signatures towards the end of the file. Try increasing the number of bytes used for detection with:

mimetype.SetLimit(1024*1024) // Set limit to 1MB.
// or
mimetype.SetLimit(0) // No limit, whole file content used.
mimetype.DetectFile("file.doc")

If increasing the limit does not help, please open an issue.

Tests

In addition to unit tests, mimetype_tests compares the library with libmagic for around 50 000 sample files. Check the latest comparison results here.

Benchmarks

Benchmarks are performed when a PR is open. The results can be seen on the workflows page. Performance improvements are welcome but correctness is prioritized.

Structure

mimetype uses a hierarchical structure to keep the MIME type detection logic. This reduces the number of calls needed for detecting the file type. The reason behind this choice is that there are file formats used as containers for other file formats. For example, Microsoft Office files are just zip archives containing specific metadata files. Once a file has been identified as a zip, there is no need to check if it is a text file, but it is worth checking if it is a Microsoft Office file.
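A minimal sketch of the idea, not the library's actual implementation: the node type, the tree contents, and the simplified "word/" check are all assumptions made for illustration. Children are tried only after their parent has matched, so a zip input is never tested against unrelated formats.

```go
package main

import (
	"bytes"
	"fmt"
)

// node is a hypothetical tree node: a MIME type, a check, and more specific
// child types that are tried only once the check has succeeded.
type node struct {
	mime     string
	matches  func(raw []byte) bool
	children []*node
}

// detect walks down the tree and returns the most specific matching type.
func (n *node) detect(raw []byte) string {
	for _, c := range n.children {
		if c.matches(raw) {
			return c.detect(raw)
		}
	}
	return n.mime
}

var root = &node{
	mime:    "application/octet-stream",
	matches: func([]byte) bool { return true },
	children: []*node{
		{
			mime:    "application/zip",
			matches: func(raw []byte) bool { return bytes.HasPrefix(raw, []byte("PK\x03\x04")) },
			children: []*node{{
				mime: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
				// Simplified assumption: a real detector inspects zip entries properly.
				matches: func(raw []byte) bool { return bytes.Contains(raw, []byte("word/")) },
			}},
		},
		{
			mime:    "application/pdf",
			matches: func(raw []byte) bool { return bytes.HasPrefix(raw, []byte("%PDF-")) },
		},
	},
}

func main() {
	fmt.Println(root.detect([]byte("PK\x03\x04....word/document.xml")))
	fmt.Println(root.detect([]byte("PK\x03\x04....notes.txt")))
	fmt.Println(root.detect([]byte("%PDF-1.7")))
}
```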

To prevent loading entire files into memory, when detecting from a reader or from a file mimetype limits itself to reading only the header of the input.

[Figure: how the project is structured]

Contributing

Contributions are never expected but very much welcome. mimetype_tests shows which file formats are most often misidentified and can help prioritise. When submitting a PR for detection of a new file format, please make sure to add a record to the list of testcases in mimetype_test.go. For complex files a record can be added in the testdata directory.

Extension points exported contracts — how you extend this code

Detector (FuncType)
Detector receives the raw data of a file and returns whether the data meets any conditions. The limit parameter is an up
internal/magic/magic.go
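A sketch of what a function satisfying this contract might look like. The signature func(raw []byte, limit uint32) bool is an assumption based on the description above, and the "FOO1" format, its "END" trailer, and the limit values are invented for the example. It also assumes that fewer bytes than limit means raw holds the whole file rather than just its header.

```go
package main

import (
	"bytes"
	"fmt"
)

// detector mirrors the contract described above; the exact signature is an
// assumption, not copied from the library.
type detector func(raw []byte, limit uint32) bool

// fooDetector matches a made-up format whose files start with "FOO1" and
// whose complete files end with "END".
func fooDetector(raw []byte, limit uint32) bool {
	if !bytes.HasPrefix(raw, []byte("FOO1")) {
		return false
	}
	// Assumption: fewer bytes than limit means raw is the whole file,
	// so the trailer can be checked as well.
	if uint32(len(raw)) < limit {
		return bytes.HasSuffix(raw, []byte("END"))
	}
	// Only the header is available; the prefix is all that can be verified.
	return true
}

// Compile-time check that fooDetector has the assumed shape.
var _ detector = fooDetector

func main() {
	fmt.Println(fooDetector([]byte("FOO1 data END"), 3072)) // whole file, trailer present
	fmt.Println(fooDetector([]byte("FOO1 data"), 3072))     // whole file, trailer missing
	fmt.Println(fooDetector([]byte("FOO1 data"), 9))        // header only
}
```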

Core symbols most depended-on inside this repo

newMIME · called by 200 · mime.go
alias · called by 52 · mime.go
Advance · called by 29 · internal/scan/bytes.go
ByteIsWS · called by 21 · internal/scan/bytes.go
Peek · called by 20 · internal/scan/bytes.go
String · called by 19 · mime.go
offset · called by 19 · internal/magic/magic.go
Detect · called by 15 · mimetype.go

Shape

Function 349
Method 41
Struct 10
TypeAlias 4
FuncType 1

Languages

Go 100%

Modules by API surface

internal/magic/text.go · 44 symbols
internal/magic/image.go · 26 symbols
mimetype_test.go · 24 symbols
internal/magic/zip.go · 23 symbols
internal/magic/binary.go · 21 symbols
internal/magic/archive.go · 21 symbols
internal/scan/bytes.go · 19 symbols
internal/json/parser_test.go · 18 symbols
internal/json/parser.go · 18 symbols
internal/scan/bytes_test.go · 15 symbols
internal/magic/ftyp.go · 15 symbols
mime.go · 14 symbols

For agents

$ claude mcp add mimetype \
  -- python -m otcore.mcp_server <graph>
