Notes & TILs
Generate an RSS Feed of recent files inside a Git repository
Posted on 16 Mar, 2021
I like to log my notes & TILs in a Git repository, and recently I had an idea to showcase (& automate) my most recent learnings on my GitHub profile.
GitHub is nice enough to provide RSS feeds for the latest commits in a repo, but those feeds don't tell me which commit introduced a new file (i.e. the recent files in a repo).
The following command lists every path that was added to the Git history, relative to the repository root, along with the date of the commit that added it (most recent first):
git log --no-color --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty='%ad'
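If you want to consume that output from a program, here is a rough sketch of how it could be paired up in Python. It assumes each commit prints one date line in the `%d %b, %Y` format used above, followed by `A<TAB>path` lines. The function name `parse_added_files` is made up for this example, and `strptime` needs an English locale to read month names like "Mar".

```python
from datetime import date, datetime


def parse_added_files(log_output):
    """Pair every added path with the date line of the commit above it."""
    entries = []
    current = None  # date of the commit we are currently inside
    for line in log_output.splitlines():
        if not line.strip():
            continue  # git separates the date and the file list with blank lines
        if line.startswith("A\t"):
            if current is not None:
                entries.append((line[2:], current))
        else:
            # Anything that is not an "A" line is assumed to be a date line
            current = datetime.strptime(line.strip(), "%d %b, %Y").date()
    return entries


# Hypothetical sample, shaped like the command's output
sample = "16 Mar, 2021\n\nA\tShell/a.md\nA\tb.json\n15 Mar, 2021\n\nA\tShell/c.md\n"
for path, added_on in parse_added_files(sample):
    print(added_on.isoformat(), path)
```

Parsing a fixed date format keeps the sketch free of third-party packages, at the cost of breaking if the `--date` format is changed.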
If you just want the file names, leave the --pretty option empty:
git log --no-color --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty=''
You should see something like this:
A scripts/oib
A scripts/surf
A snippets/python.snippets
A snippets/markdown.snippets
A .Xmodmap
A codesnippets/go.md
A scripts/areyouok.go
A scripts/convert-to-gif.sh
A scripts/backup_as_gist.py
...
To limit the output to the most recent N commits, use the -n flag (it counts commits, not files, so a commit that adds several files lists all of them):
git log --no-color -n 5 --date=format:'%d %b, %Y' --diff-filter=A --name-status --pretty=''
If you want to include renamed files as well, add R to the filter (renames are listed with both the old and the new path):
git log --no-color --date=format:'%d %b, %Y' --diff-filter=AR --name-status --pretty=''
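With `R` in the filter, the `--name-status` lines no longer all look alike: an added file is `A<TAB>path`, while a rename is `R<score><TAB>old<TAB>new` (for example `R100`). A minimal sketch of telling the two apart, where the helper name `parse_name_status` is an assumption for illustration:

```python
def parse_name_status(line):
    """Return (status, path) for an added or renamed file, else None.

    For renames the new path is returned, since that is where the
    file lives after the commit.
    """
    fields = line.rstrip("\n").split("\t")
    status = fields[0]
    if status == "A" and len(fields) == 2:
        return "A", fields[1]
    if status.startswith("R") and len(fields) == 3:
        return "R", fields[2]
    return None  # blank lines, date lines, other statuses


# Hypothetical lines in the shape git prints them
for raw in ["A\tscripts/oib", "R100\told-name.md\tnew-name.md", ""]:
    print(parse_name_status(raw))
```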
The magic here is done by the --diff-filter=A option, which only shows files that were Added. I remember using this to find the birthday of README files.
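As a sketch of that "birthday" lookup for a single file, the same filter can be combined with `--follow` so the search continues across renames. The function names here are hypothetical, and the script assumes it is run from inside the repository with `git` on the PATH.

```python
import subprocess


def birthday_command(path):
    """Build the git invocation; kept separate so it can be inspected without running git."""
    return [
        "git", "log", "--no-color",
        "--diff-filter=A",          # only commits that added the path
        "--follow",                 # keep tracking the file across renames
        "--date=format:%d %b, %Y",
        "--pretty=%ad",
        "--", path,
    ]


def file_birthday(path):
    """Return the date of the oldest commit that added `path`, or None if there is none."""
    result = subprocess.run(
        birthday_command(path), capture_output=True, text=True, check=True
    )
    dates = [line for line in result.stdout.splitlines() if line.strip()]
    # git log prints newest first, so the last line is the earliest addition
    return dates[-1] if dates else None
```

Calling `file_birthday("README.md")` inside a repository would then return a string such as a date in the `%d %b, %Y` format, or `None` if the path was never added.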
NOTE: We assume that a file's creation date is the date of the commit that introduced it. Since this is a feed for a Git repo, that should make sense (I was born when I was committed 😁️)
The following Python script uses this command to write the feed to feed.xml:
#!/usr/bin/env python3

# Script to generate a feed of recently committed files in a git repository

# TODO: Add Commit Author #

import datetime
import os
import pathlib
import subprocess as sp
from xml.sax.saxutils import escape

from dateutil.parser import parse

HEAD = """<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
"""

FOOTER = """</channel>
</rss>
"""

# Assuming your current working dir is the repo
repo_name = os.path.basename(os.getcwd())
current_date = datetime.datetime.now().strftime("%a, %d %b %Y")


def get_recent_files():
    # Each commit prints its RFC 2822 date, followed by "A\t<path>" lines
    cmd = "git log --no-color -n 10 --date=rfc --diff-filter=A --name-status --pretty='%ad'"
    result = sp.Popen(cmd, shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    out, err = result.communicate()
    clean_output = out.decode("utf-8").replace("A\t", "").split("\n")
    clean_output = list(filter(lambda x: x != "", clean_output))

    files = []
    date = None
    for item in clean_output:
        if is_valid(item):
            # A date line: remember it for the paths that follow
            date = item
        elif pathlib.Path(item).exists():
            # Skip files that were added but have since been deleted or moved
            entry = item, date
            files.append(entry)
    return files


def is_valid(date):
    # True if the line parses as a date, i.e. it is a commit date line
    try:
        return isinstance(parse(date), datetime.datetime)
    except (ValueError, OverflowError):
        return False


def get_repo_link():
    repo_origin = "git config --get remote.origin.url"
    result = sp.Popen(repo_origin, shell=True, stdout=sp.PIPE, stderr=sp.PIPE)
    result, err = result.communicate()
    # git prints the URL followed by a newline; it is written to the feed as-is
    return result.decode("utf-8")


if __name__ == "__main__":
    files = get_recent_files()
    with open("feed.xml", "w") as feed:
        feed.write(HEAD)
        feed.write(
            f"""<title>{repo_name}.git</title>\n<link>{get_repo_link()}</link>\n"""
        )
        feed.write(
            f"""<description>Recently committed files in {repo_name}</description>\n"""
        )
        feed.write(f"""<lastBuildDate>{current_date}</lastBuildDate>""")
        for item in files:
            feed.write("""<item>\n""")
            # Escape &, < and > so unusual file names cannot break the XML
            feed.write(f"""<title>{escape(item[0])}</title>\n""")
            feed.write(f"""<pubDate>{item[1]}</pubDate>\n""")
            feed.write("""</item>\n""")
        feed.write(FOOTER)
The XML generated is valid enough to be consumed without any issue. Here is a demo of output from the above script.
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>til.git</title>
<link>https://github.com/Bhupesh-V/til
</link>
<description>Recently committed files in til</description>
<lastBuildDate>Wed, 17 Mar 2021</lastBuildDate><item>
<title>Shell/generate-feed-for-files-in-git-repo.md</title>
<pubDate>Tue, 16 Mar 2021 16:24:47 +0530</pubDate>
</item>
<item>
<title>recent_tils.json</title>
<pubDate>Tue, 16 Mar 2021 16:24:47 +0530</pubDate>
</item>
<item>
<title>Shell/parsing-git-status-for-tracked-untracked-changes.md</title>
<pubDate>Mon, 15 Mar 2021 19:26:12 +0530</pubDate>
</item>
<item>
<title>Shell/get-current-git-branch-name.md</title>
<pubDate>Sat, 13 Mar 2021 13:11:02 +0530</pubDate>
</item>
<item>
<title>Miscellaneous/chaos-engineering-collected-notes.md</title>
<pubDate>Sun, 7 Mar 2021 19:42:28 +0530</pubDate>
</item>
</channel>
</rss>
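If stricter output is ever needed, one possible variation (not what the script above does) is to build the feed with the standard library's `xml.etree.ElementTree`, which escapes text automatically, and to format `lastBuildDate` with `email.utils.format_datetime` so it carries a time and timezone like the `pubDate` values. The function name `build_feed` and its parameters are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(repo_name, repo_link, files, built_at):
    """Return the feed as a string; `files` is a list of (path, pub_date) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{repo_name}.git"
    # strip() drops the trailing newline that git config prints after the URL
    ET.SubElement(channel, "link").text = repo_link.strip()
    ET.SubElement(channel, "description").text = (
        f"Recently committed files in {repo_name}"
    )
    # built_at should be timezone-aware so the offset is rendered
    ET.SubElement(channel, "lastBuildDate").text = format_datetime(built_at)
    for path, pub_date in files:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = path
        ET.SubElement(item, "pubDate").text = pub_date
    return ET.tostring(rss, encoding="unicode")


# Hypothetical input, shaped like the data the script collects
feed_xml = build_feed(
    "til",
    "https://github.com/Bhupesh-V/til\n",
    [("Shell/get-current-git-branch-name.md", "Sat, 13 Mar 2021 13:11:02 +0530")],
    datetime.now(timezone.utc),
)
print(feed_xml)
```

The trade-off is that the result comes out as a single line without the XML declaration, so the plain string approach stays easier to read when eyeballing the file.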