Added semgrep and some fixes #88

Merged
merged 4 commits into from
Feb 27, 2025
65 changes: 65 additions & 0 deletions .github/workflows/semgrep.yaml
@@ -0,0 +1,65 @@
name: Semgrep SAST Scan

on:
  pull_request:

jobs:
  semgrep:
    # User definable name of this GitHub Actions job.
    name: semgrep/ci
    # If you are self-hosting, change the following `runs-on` value:
    runs-on: ubuntu-latest
    container:
      # A Docker image with Semgrep installed. Do not change this.
      image: returntocorp/semgrep
    # Skip any PR created by dependabot to avoid permission issues:
    if: (github.actor != 'dependabot[bot]')
    permissions:
      # required for all workflows
      security-events: write
      # only required for workflows in private repositories
      actions: read
      contents: read

    steps:
      # Fetch project source with GitHub Actions Checkout.
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Perform Semgrep Analysis
        # @NOTE: This is the actual semgrep command to scan your code.
        # Modify the --config option to 'r/all' to scan using all rules,
        # or use multiple flags to specify particular rules, such as
        # --config r/all --config custom/rules
        run: semgrep scan -q --sarif --config auto --config "p/secrets" . > semgrep-results.sarif

      - name: Pretty-Print SARIF Output
        run: |
          jq . semgrep-results.sarif > formatted-semgrep-results.sarif || echo "{}"
          echo "Formatted SARIF Output (First 20 lines):"
          head -n 20 formatted-semgrep-results.sarif || echo "{}"

      - name: Validate JSON Output
        run: |
          if ! jq empty formatted-semgrep-results.sarif > /dev/null 2>&1; then
            echo "⚠️ Semgrep output is not valid JSON. Skipping annotations."
            exit 0
          fi

      - name: Add PR Annotations for Semgrep Findings
        run: |
          total_issues=$(jq '.runs[0].results | length' formatted-semgrep-results.sarif)
          if [[ "$total_issues" -eq 0 ]]; then
            echo "✅ No Semgrep issues found!"
            exit 0
          fi

          jq -c '.runs[0].results[]' formatted-semgrep-results.sarif | while IFS= read -r issue; do
            file=$(echo "$issue" | jq -r '.locations[0].physicalLocation.artifactLocation.uri')
            line=$(echo "$issue" | jq -r '.locations[0].physicalLocation.region.startLine')
            message=$(echo "$issue" | jq -r '.message.text')

            if [[ -n "$file" && -n "$line" && -n "$message" ]]; then
              echo "::error file=$file,line=$line,title=Semgrep Issue::${message}"
            fi
          done
13 changes: 12 additions & 1 deletion Dockerfile
@@ -14,7 +14,18 @@ ARG BRANCH=main
## Clone the repository with the specified branch
RUN git clone --branch ${BRANCH} https://github.com/luxonis/datadreamer.git

## Create a non-root user and switch to that user
RUN adduser --disabled-password --gecos "" non-root && \
    chown -R non-root:non-root /app

## Switch to the non-root user
USER non-root

## Install the Python package as the non-root user
RUN cd datadreamer && pip install .

## define image execution
## Set PATH for the installed executable
ENV PATH="/home/non-root/.local/bin:/usr/local/bin:$PATH"

## Define image execution
ENTRYPOINT ["datadreamer"]
4 changes: 2 additions & 2 deletions datadreamer/dataset_annotation/owlv2_annotator.py
@@ -410,9 +410,9 @@

# Image-driven annotation
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
im = Image.open(requests.get(url, stream=True).raw)
im = Image.open(requests.get(url, stream=True).raw) # nosemgrep

Check failure on line 413 in datadreamer/dataset_annotation/owlv2_annotator.py

GitHub Actions / semgrep/ci

Semgrep Issue

Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead.

query_url = "http://images.cocodataset.org/val2017/000000058111.jpg"
query_image = Image.open(requests.get(query_url, stream=True).raw)
query_image = Image.open(requests.get(query_url, stream=True).raw) # nosemgrep

Check failure on line 415 in datadreamer/dataset_annotation/owlv2_annotator.py

GitHub Actions / semgrep/ci

Semgrep Issue

Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead.

final_boxes, final_scores, final_labels = annotator.annotate_batch(
[im], [query_image], conf_threshold=0.9
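
The Semgrep finding above recommends `https://`, while this PR suppresses it with `# nosemgrep`. As an alternative, the sketch below shows how the scheme could be upgraded before the request is made. It is only an illustration: `force_https` is a hypothetical helper that is not in the repository, and it assumes the image host actually serves the same paths over HTTPS, which is not verified here.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return `url` with an `http` scheme replaced by `https` (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    # Any other scheme (https, ftp, ...) is returned unchanged.
    return urlunsplit(parts)
```

With such a helper, the example could call `requests.get(force_https(url), stream=True)` and the suppression comment would no longer be needed, provided the assumption about the host holds.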
2 changes: 0 additions & 2 deletions datadreamer/image_generation/sdxl_turbo_image_generator.py
@@ -41,8 +41,6 @@ def _init_gen_model(self) -> AutoPipelineForText2Image:
        if self.device == "cpu":
            base = AutoPipelineForText2Image.from_pretrained(
                "stabilityai/sdxl-turbo",
                # variant="fp16",
                torch_dtype=torch.float32,
                use_safetensors=True,
            )
            base.to("cpu")