- The parser of `git grep`'s output uses `bufio.Scanner`, which is a good
choice overall, however it does have a limit that's usually not noticed,
it will not read more than `64 * 1024` bytes at once which can be hit in
practical scenarios.
- Use `bufio.Reader` instead which doesn't have this limitation, but is
a bit harder to work with as it's a more lower level primitive.
- Adds unit test.
- Resolves https://codeberg.org/forgejo/forgejo/issues/3149
(cherry picked from commit 668709a33f)
**Backport:** https://codeberg.org/forgejo/forgejo/pulls/2906
Following #2763 (refactor of git check-attr)
and #2866 (wrong log.Error format in check-attr)
- refactors the `nul-byte` reader to be used in both the streaming and one-off cases.
- add test for some failure cases
- don't log the error returned by `cmd.Run`, but return it to the `CheckPath` caller (which can then decide what to do with it).
This should solve the following flaky `log.Error` (or at least move it to the caller, instead of being inside a random goroutine):
https://codeberg.org/forgejo/forgejo/actions/runs/9541/jobs/5#jobstep-7-839
> FATAL ERROR: log.Error has been called: 2024/03/28 14:30:33 ...it/repo_attribute.go:313:func2() [E] Unable to open checker for 3fa2f829675543ecfc16b2891aebe8bf0608a8f4. Error: failed to run attr-check. Error: exit status 128
Stderr: fatal: this operation must be run in a work tree
Co-authored-by: oliverpool <git@olivier.pfad.fr>
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/2939
Reviewed-by: oliverpool <oliverpool@noreply.codeberg.org>
Co-authored-by: forgejo-backport-action <forgejo-backport-action@noreply.codeberg.org>
Co-committed-by: forgejo-backport-action <forgejo-backport-action@noreply.codeberg.org>
- `%w` is to wrap errors, but can only be used by `fmt.Errorf`. Instead
use `%v` to display the error.
- Regression of #2763
Before:
[E] failed to run attr-check. Error: %!w(*exec.ExitError=&{0xc006568e28 []})
Stderr: fatal: this operation must be run in a work tree
After:
[E] failed to run attr-check. Error: exit status 128
Stderr: fatal: this operation must be run in a work tree
`CommitGPGSignature` was originally made to store information about a
commit's GPG signature. Nowadays, it is used to store information about
SSH signatures too, and not just commit signatures, but tag signatures
too.
As such, rename it to `ObjectSignature`, because that describes what it
does a whole lot better.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
Just like commits, tags can be signed with either an OpenPGP, or with an
SSH key. While the latter is supported already, SSH-signed tags have not
been. This patch teaches the git module to recognize and handle
SSH-signed tags too.
This will stop the signatures appearing in release notes, but are
currently unused otherwise.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
Most time, when invoking `git.OpenRepository`, `objectFormat` will not
be used, so it's a waste to invoke commandline to get the object format.
This PR make it a lazy operation, only invoke that when necessary.
(cherry picked from commit e84e5db6de0306d514b1f1a9657931fb7197a188)
(cherry picked from commit 25b842df261452a29570ba89ffc3a4842d73f68c)
Conflicts:
routers/web/repo/wiki.go
services/repository/branch.go
services/repository/migrate.go
services/wiki/wiki.go
also apply to Forgejo specific usage of the refactored functions
Close#29509
Windows, unlike Linux, does not have signal-specified exit codes.
Therefore, we should add a Windows-specific check for Windows. If we
don't do this, the logs will always show a failed status, even though
the command actually works correctly.
If you check the Go source code in exec_windows.go, you will see that it
always returns exit code 1.
![image](https://github.com/go-gitea/gitea/assets/30816317/9dfd7c70-9995-47d9-9641-db793f58770c)
The exit code 1 does not exclusively signify a SIGNAL KILL; it can
indicate any issue that occurs when a program fails.
(cherry picked from commit 423372d84ab3d885e47d4a00cd69d6040b61cc4c)
- When a user goes opens a symlink file in Forgejo, the file would be
rendered with the path of the symlink as content.
- Add a button that is shown when the user opens a *valid* symlink file,
which means that the symlink must have an valid path to an existent
file and after 999 follows isn't a symlink anymore.
- Return the relative path from the `FollowLink` functions, because Git
really doesn't want to tell where an file is located based on the blob ID.
- Adds integration tests.
Fixes#29101
Related #29298
Discard all read data to prevent misinterpreting existing data. Some
discard calls were missing in error cases.
---------
Co-authored-by: yp05327 <576951401@qq.com>
(cherry picked from commit d6811baf88ca6d58b92d4dc12b1f2a292198751f)
Fixes the reason why #29101 is hard to replicate.
Related #29297
Create a repo with a file with minimum size 4097 bytes (I use 10000) and
execute the following code:
```go
gitRepo, err := gitrepo.OpenRepository(db.DefaultContext, <repo>)
assert.NoError(t, err)
commit, err := gitRepo.GetCommit(<sha>)
assert.NoError(t, err)
entry, err := commit.GetTreeEntryByPath(<file>)
assert.NoError(t, err)
b := entry.Blob()
// Create a reader
r, err := b.DataAsync()
assert.NoError(t, err)
defer r.Close()
// Create a second reader
r2, err := b.DataAsync()
assert.NoError(t, err) // Should be no error but is ErrNotExist
defer r2.Close()
```
The problem is the check in `CatFileBatch`:
79217ea63c/modules/git/repo_base_nogogit.go (L81-L87)
`Buffered() > 0` is used to check if there is a "operation" in progress
at the moment. This is a problem because we can't control the internal
buffer in the `bufio.Reader`. The code above demonstrates a sequence
which initiates an operation for which the code thinks there is no
active processing. The second call to `DataAsync()` therefore reuses the
existing instances instead of creating a new batch reader.
(cherry picked from commit f74c869221624092999097af38b6f7fae4701420)
If a documentation file is marked with a `linguist-documentation=false`
attribute, include it in language stats.
However, make sure that we do *not* include documentation languages as
fallback.
Added a new test case to exercise the formerly buggy behaviour.
Problem discovered while reviewing @KN4CK3R's tests from gitea#29267.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
Based on @KN4CK3R's work in gitea#29267. This drops the custom
`LinguistBoolAttrib` type, and uses `optional.Option` instead. I added
the `isTrue()` and `isFalse()` (function-local) helpers to make the code
easier to follow, because these names convey their goal better than
`v.ValueorDefault(false)` or `!v.ValueOrDefault(true)`.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
The test suite was broken e.g. on Debian 12 due to requiring a very
recent version of Git installed on the system. This commit skips SHA256
tests in the git module, if a Git version older than 2.42 or gogit is used.
With this option, it is possible to require a linear commit history with
the following benefits over the next best option `Rebase+fast-forward`:
The original commits continue existing, with the original signatures
continuing to stay valid instead of being rewritten, there is no merge
commit, and reverting commits becomes easier.
Closes#24906
- In Git version v2.43.1, the behavior of `GIT_FLUSH` was accidentially
flipped. This causes Forgejo to hang on the `check-attr` command,
because no output was being flushed.
- Workaround this by detecting if Git v2.43.1 is used and set
`GIT_FLUSH=0` thus getting the correct behavior.
- Ref: https://lore.kernel.org/git/CABn0oJvg3M_kBW-u=j3QhKnO=6QOzk-YFTgonYw_UvFS1NTX4g@mail.gmail.com/
- Resolves#2333.
Replace #28849. Thanks to @yp05327 for the looking into the problem.
Fix#28840
The old behavior of newSignatureFromCommitline is not right. The new
parseSignatureFromCommitLine:
1. never fails
2. only accept one format (if there is any other, it could be easily added)
And add some tests.
(cherry picked from commit a24e1da7e9e38fc5f5c84c083d122c0cc3da4b74)
Recognise the `linguist-documentation` and `linguist-detectable`
attributes in `.gitattributes` files, and use them in
`GetLanguageStats()` to make a decision whether to include a particular
file in the stats or not.
This allows one more control over which files in their repositories
contribute toward the language statistics, so that for a project that is
mostly documentation, the language stats can reflect that.
Fixes#1672.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
(cherry picked from commit 6d4e02fe5f)
(cherry picked from commit ee1ead8189)
(cherry picked from commit 2dbec730e8)
When trying to find a `README.md` in a `.profile` repo, do so case
insensitively. This change does not make it possible to render readmes
in formats other than Markdown, it just removes the hard-coded
"README.md".
Also adds a few tests to make sure the change works.
Fixes#1494.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
(cherry picked from commit edd219d8e9)
(cherry picked from commit 2c0105ef17)
(cherry picked from commit 3975a9f3aa)
(cherry picked from commit dee4a18423)
(cherry picked from commit 60aee6370f)
## Purpose
This is a refactor toward building an abstraction over managing git
repositories.
Afterwards, it does not matter anymore if they are stored on the local
disk or somewhere remote.
## What this PR changes
We used `git.OpenRepository` everywhere previously.
Now, we should split them into two distinct functions:
Firstly, there are temporary repositories which do not change:
```go
git.OpenRepository(ctx, diskPath)
```
Gitea managed repositories having a record in the database in the
`repository` table are moved into the new package `gitrepo`:
```go
gitrepo.OpenRepository(ctx, repo_model.Repo)
```
Why is `repo_model.Repository` the second parameter instead of file
path?
Because then we can easily adapt our repository storage strategy.
The repositories can be stored locally, however, they could just as well
be stored on a remote server.
## Further changes in other PRs
- A Git Command wrapper on package `gitrepo` could be created. i.e.
`NewCommand(ctx, repo_model.Repository, commands...)`. `git.RunOpts{Dir:
repo.RepoPath()}`, the directory should be empty before invoking this
method and it can be filled in the function only. #28940
- Remove the `RepoPath()`/`WikiPath()` functions to reduce the
possibility of mistakes.
---------
Co-authored-by: delvh <dev.lh@web.de>
This should fix https://github.com/go-gitea/gitea/issues/28927
Technically older versions of Git would support this flag as well, but
per https://github.com/go-gitea/gitea/pull/28466 that's the version
where using it (object-format=sha256) left "experimental" state.
`sha1` is (currently) the default, so older clients should be unaffected
in either case.
Signed-off-by: jolheiser <john.olheiser@gmail.com>
When LFS hooks are present in gitea-repositories, operations like git
push for creating a pull request fail. These repositories are not meant
to include LFS files or git push them, that is handled separately. And
so they should not have LFS hooks.
Installing git-lfs on some systems (like Debian Linux) will
automatically set up /etc/gitconfig to create LFS hooks in repositories.
For most git commands in Gitea this is not a problem, either because
they run on a temporary clone or the git command does not create LFS
hooks.
But one case where this happens is git archive for creating repository
archives. To fix that, add a GIT_CONFIG_NOSYSTEM=1 to disable using the
system configuration for that command.
According to a comment, GIT_CONFIG_NOSYSTEM is not used for all git
commands because the system configuration can be intentionally set up
for Gitea to use.
Resolves#19810, #21148
Nowadays, cache will be used on almost everywhere of Gitea and it cannot
be disabled, otherwise some features will become unaviable.
Then I think we can just remove the option for cache enable. That means
cache cannot be disabled.
But of course, we can still use cache configuration to set how should
Gitea use the cache.
The 4 functions are duplicated, especially as interface methods. I think
we just need to keep `MustID` the only one and remove other 3.
```
MustID(b []byte) ObjectID
MustIDFromString(s string) ObjectID
NewID(b []byte) (ObjectID, error)
NewIDFromString(s string) (ObjectID, error)
```
Introduced the new interfrace method `ComputeHash` which will replace
the interface `HasherInterface`. Now we don't need to keep two
interfaces.
Reintroduced `git.NewIDFromString` and `git.MustIDFromString`. The new
function will detect the hash length to decide which objectformat of it.
If it's 40, then it's SHA1. If it's 64, then it's SHA256. This will be
right if the commitID is a full one. So the parameter should be always a
full commit id.
@AdamMajer Please review.