This provides an option for #223 without fully resolving it. (I think.)
Essentially, it acts very similar to the `gzip_static` and similar options for nginx, where it will check for the existence of pre-compressed files and serve those instead if the client allows it. I couldn't find a pre-existing way to actually parse the Accept-Encoding header properly (admittedly didn't look very hard) and just implemented one on my own that should be fine.
This should hopefully not have the same DOS vulnerabilities as #302, since it relies on the existing caching system. Compressed versions of files will be cached just like any other files, and that includes cache for missing files as well.
The compressed files will also be accessible directly, and this won't automatically decompress them. So, if you have a `tar.gz` file that you access directly, it will still be downloaded as the gzipped version, although you will now gain the option to download the `.tar` directly and decompress it in transit. (Which doesn't affect the server at all, just the client's way of interpreting it.)
----
One key thing this change also adds is a short-circuit when accessing directories: these always return 404 via the API, although they'd try the cache anyway and go through that route, which was kind of slow. Adding in the additional encodings, it's going to try for .gz, .br, and .zst files in the worst case as well, which feels wrong. So, instead, it just always falls back to the index-check behaviour if the path ends in a slash or is empty. (Which is implicitly just a slash.)
----
For testing, I set up this repo: https://codeberg.org/clarfonthey/testrepo
I ended up realising that LFS wasn't supported by default with `just dev`, so, it ended up working until I made sure the files on the repo *didn't* use LFS.
Assuming you've run `just dev`, you can go directly to this page in the browser here: https://clarfonthey.localhost.mock.directory:4430/testrepo/
And also you can try a few cURL commands:
```shell
curl https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure
curl -H 'Accept-Encoding: gz' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | gunzip -
curl -H 'Accept-Encoding: br' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | brotli --decompress -
curl -H 'Accept-Encoding: zst' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | zstd --decompress -
```
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/387
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: ltdk <usr@ltdk.xyz>
Co-committed-by: ltdk <usr@ltdk.xyz>
HTTP uses GMT [1,2] rather than UTC as timezone for timestamps. However,
the Last-Modified header used UTC which confused at least wget.
Before, UTC was used:
$ wget --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
...
Last-Modified: Sun, 11 Sep 2022 08:37:42 UTC
...
Last-modified header invalid -- time-stamp ignored.
...
After, GMT is used:
$ wget --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
...
Last-Modified: Sun, 11 Sep 2022 08:37:42 GMT
...
(no last-modified-header-invalid warning)
[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Last-Modified
[2]: https://www.rfc-editor.org/rfc/rfc9110#name-date-time-formatsFixes#364
---
Whatt I noticed is that the If-Modified-Since header isn't accepted (neither with GMT nor with UTC):
```
$ wget --header "If-Modified-Since: Sun, 11 Sep 2022 08:37:42 GMT" --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
Spider mode enabled. Check if remote file exists.
--2024-07-15 23:31:41-- https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
Resolving cb_pages_tests.localhost.mock.directory (cb_pages_tests.localhost.mock.directory)... 127.0.0.1
Connecting to cb_pages_tests.localhost.mock.directory (cb_pages_tests.localhost.mock.directory)|127.0.0.1|:4430... connected.
WARNING: The certificate of ‘cb_pages_tests.localhost.mock.directory’ is not trusted.
WARNING: The certificate of ‘cb_pages_tests.localhost.mock.directory’ doesn't have a known issuer.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Allow: GET, HEAD, OPTIONS
Cache-Control: public, max-age=600
Content-Length: 124635
Content-Type: image/jpeg
Etag: "073af1960852e2a4ef446202c7974768b9881814"
Last-Modified: Sun, 11 Sep 2022 08:37:42 GMT
Referrer-Policy: strict-origin-when-cross-origin
Server: pages-server
Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
Date: Mon, 15 Jul 2024 21:31:42 GMT
Length: 124635 (122K) [image/jpeg]
Remote file exists
```
I would have expected a 304 (Not Modified) rather than a 200 (OK). I assume this is simply not supported and on production 304 is returned by a caching proxy in front of pages-server.
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/365
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Peter Gerber <peter@arbitrary.ch>
Co-committed-by: Peter Gerber <peter@arbitrary.ch>
This PR renames `gitea` in cli args to `forge` and `GITEA` in environment variables to `FORGE` and adds the gitea names as aliases for the forge names.
Also closes#311
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/339
Remove leading slashes from captured portions of paths when
redirecting using splats. This makes a directive like
"/articles/* /posts/:splat 302" behave as described in
FEATURES.md, i.e. "/articles/foo" now redirects to
"/posts/foo" rather than to "/posts//foo". Fixes#269.
This also changes the behavior of a redirect like
"/articles/* /posts:splat 302". "/articles/foo" will now
redirect to "/postsfoo" rather than to "/posts/foo".
This change also fixes an issue where paths like
"/articles123" would be incorrectly matched by the above
patterns.
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/308
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Daniel Erat <dan@erat.org>
Co-committed-by: Daniel Erat <dan@erat.org>
This would yield to the error "forge client failed" instead of e.g. "404 Not Found". The issue was introduced in cbb2ce6d07.
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/306
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Moritz Marquardt <momar@noreply.codeberg.org>
Co-committed-by: Moritz Marquardt <momar@noreply.codeberg.org>
Move repetitive code from Options.matchRedirects into a new
Redirect.rewriteURL method and add a new test file.
No functional changes are intended; this is in preparation
for a later change to address #269.
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/304
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Daniel Erat <dan@erat.org>
Co-committed-by: Daniel Erat <dan@erat.org>
This PR add the `$NO_DNS_01` option (disabled by default) that removes the DNS ACME provider, and replaces the wildcard certificate by individual certificates obtained using the TLS ACME provider.
This option allows an instance to work without having to manage access tokens for the DNS provider. On the flip side, this means that a certificate can be requested for each subdomains. To limit the risk of DOS, the existence of the user/org corresponding to a subdomain is checked before requesting a cert, however, this limitation is not enough for an forge with a high number of users/orgs.
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/290
Reviewed-by: Moritz Marquardt <momar@noreply.codeberg.org>
Co-authored-by: Jean-Marie 'Histausse' Mineau <histausse@protonmail.com>
Co-committed-by: Jean-Marie 'Histausse' Mineau <histausse@protonmail.com>
Hello 👋
since it affected my deployment of the pages server I started to look into the problem of the blank pages and think I found a solution for it:
1. There is no check if the file response is empty, neither in cache retrieval nor in writing of a cache. Also the provided method for checking for empty responses had a bug.
2. I identified the redirect response to be the issue here. There is a cache write with the full cache key (e. g. rawContent/user/repo|branch|route/index.html) happening in the handling of the redirect response. But the written body here is empty. In the triggered request from the redirect response the server then finds a cache item to the key and serves the empty body. A quick fix is the check for empty file responses mentioned in 1.
3. The decision to redirect the user comes quite far down in the upstream function. Before that happens a lot of stuff that may not be important since after the redirect response comes a new request anyway. Also, I suspect that this causes the caching problem because there is a request to the forge server and its error handling with some recursions happening before. I propose to move two of the redirects before "Preparing"
4. The recursion in the upstream function makes it difficult to understand what is actually happening. I added some more logging to have an easier time with that.
5. I changed the default behaviour to append a trailing slash to the path to true. In my tested scenarios it happened anyway. This way there is no recursion happening before the redirect.
I am not developing in go frequently and rarely contribute to open source -> so feedback of all kind is appreciated
closes#164
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/292
Reviewed-by: 6543 <6543@obermui.de>
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Hoernschen <julian.hoernschemeyer@mailbox.org>
Co-committed-by: Hoernschen <julian.hoernschemeyer@mailbox.org>
A database bug in xorm.go prevents the pages-server from saving a
renewed certificate for a domain that already has one in the database.
Co-authored-by: crystal <crystal@noreply.codeberg.org>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/209
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Crystal <crystal@noreply.codeberg.org>
Co-committed-by: Crystal <crystal@noreply.codeberg.org>
- Currently if the canonical domain validations fails(either for
legitimate reasons or for bug reasons like the request to Gitea/Forgejo
failing) it will use main domain certificate, which in the case for
custom domains will warrant a security error as the certificate isn't
issued to the custom domain.
- This patch handles this situation more gracefully and instead only
disallow obtaining a certificate if the domain validation fails, so in
the case that a certificate still exists it can still be used even if
the canonical domain validation fails. There's a small side effect,
legitimate users that remove domains from `.domain` will still be able
to use the removed domain(as long as the DNS records exists) as long as
the certificate currently hold by pages-server isn't expired.
- Given the increased usage in custom domains that are resulting in
errors, I think it ways more than the side effect.
- In order to future-proof against future slowdowns of instances, add a retry mechanism to the domain validation function, such that it's more likely to succeed even if the instance is not responding.
- Refactor the code a bit and add some comments.
Co-authored-by: Gusted <postmaster@gusted.xyz>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/160
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
- Currently any error generated by requesting the `.domains` file of a repository would be logged under the info log level, which isn't the correct log level when we exclude the not found error.
- Use warn log level if the error isn't the not found error.
Co-authored-by: Gusted <postmaster@gusted.xyz>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/162
Reviewed-by: Otto <otto@codeberg.org>
Added new TockenBucket named `acmeClientFailLimit` to avoid being banned because of the [Failed validation limit](https://letsencrypt.org/docs/failed-validation-limit/) of Let's Encrypt.
The behaviour is similar to the other limiters blocking the `obtainCert` func ensuring rate under limit.
Co-authored-by: fsologureng <sologuren@estudiohum.cl>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/151
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Felipe Leopoldo Sologuren Gutiérrez <fsologureng@noreply.codeberg.org>
Co-committed-by: Felipe Leopoldo Sologuren Gutiérrez <fsologureng@noreply.codeberg.org>
- It's not guaranteed that `tls.X509KeyPair` will set `c.Leaf`.
- This patch fixes this by using a wrapper that parses the leaf
certificate(in bytes) if `c.Leaf` wasn't set.
- Resolves#149
Co-authored-by: Gusted <postmaster@gusted.xyz>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/150
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
If no repository is found the user expects a 404 status code
instead of a dependency failed status code (as it was before).
Signed-off-by: Jan Klippel <c0d3b3rg@kl1pp3l.de>
Fixes: https://codeberg.org/Codeberg/Community/issues/809
Co-authored-by: Jan Klippel <c0d3b3rg@kl1pp3l.de>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/141
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: jklippel <jklippel@noreply.codeberg.org>
Co-committed-by: jklippel <jklippel@noreply.codeberg.org>
As per [the documentation](https://pkg.go.dev/net/http#Serve), it doesn't enable HTTP2 by-default, unless we enable it via the `NextProtos` option.
Co-authored-by: Gusted <williamzijl7@hotmail.com>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/137
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
we have big functions that handle all stuff ... we should split this into smaler chuncks so we could test them seperate and make clear cuts in what happens where
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/135
- For production(*cough* Codeberg *cough*), it's important to not use
mock certs. So fail right from the start if this is the case and not try
to "handle it gracefully", as it would break production.
- Resolves#131
CC @6543
Co-authored-by: Gusted <williamzijl7@hotmail.com>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/133
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>