Working on workloads and building

Commit 7e6c5e36 authored 11 months ago by Milo Craun
--- a/README.md
+++ b/README.md
@@ -36,3 +36,16 @@ For more info on the vectorizer go [vectorizer](https://gcc.gnu.org/projects/tre
 The official ARM NEON page has a good overview of the NEON architecture for vector instructions.
 It can be found [here](https://developer.arm.com/documentation/102474/0100/Overview?lang=en).
 Key takeaways are that Vector instructions act on vector registers (128-bit or 64-bit).
+
+# 2024-04-13 - Milo
+## First workload
+
+I added a basic image processing task to the repo.
+The file `img_gray.c` converts an RGB image into grayscale
+  by computing the luminance of each pixel.
+Each output pixel depends only on the matching input pixel, so the loop should vectorize well and show a clear speedup.
+
+## Build Script
+
+I am currently writing a build script that automatically builds vectorized and
+non-vectorized binaries for x86 and ARM.
--- a/build.sh
+++ b/build.sh
+#!/usr/bin/bash
+
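+# This shell script builds vectorized and non-vectorized versions of a C file
+# for x86_64 and AArch64.
+# Usage: ./build.sh NAME   (compiles NAME.c; pass the name without the .c suffix)
+# The AArch64 cross compiler must be installed in your home directory under ~/aarch.
+
+# Every build is static and uses -O1; the vectorized builds add
+# -ftree-vectorize and -fopt-info-vec-all to report vectorizer decisions.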
+
+ARM=~/aarch/bin/aarch64-none-linux-gnu-gcc
+echo "Building x86"
+
+echo "Vectorized"
+gcc -static -O1 -ftree-vectorize -msse -fopt-info-vec-all "$1.c" -o "$1-x86-vec"
+
+echo "No vectorize"
+gcc -static -O1 -msse "$1.c" -o "$1-x86-nvec"
+
+echo "Building AArch64"
+
+echo "Vectorized"
+"$ARM" -static -O1 -ftree-vectorize -fopt-info-vec-all "$1.c" -o "$1-arm-vec"
+
+echo "No vectorize"
+"$ARM" -static -O1 "$1.c" -o "$1-arm-nvec"
--- a/img_gray.c
+++ b/img_gray.c
+/**
+ * A very basic (and abstracted) image processing filter that
+ * converts an RGB image (represented as 3 matrices) into a
+ * grayscale image using the Rec. 709 luminance weights
+ * (the Y component of the CIE 1931 XYZ color space)
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <math.h>
+#define num 64  // image width and height in pixels
+
+void convert(float ** R, float ** G, float ** B, float ** lum) 
+{
+    // Formula: Y = 0.2126*R + 0.7152*G + 0.0722*B
+    for (int y = 0; y < num; y++) {
+        for (int x = 0; x < num; x++) {
+            // float literals keep the arithmetic in single precision
+            lum[y][x] = 0.2126f*R[y][x] + 0.7152f*G[y][x] + 0.0722f*B[y][x];
+        }
+    }
+}
+
+int main() {
+    float ** R = (float **)malloc(num * sizeof(float *));
+    float ** G = (float **)malloc(num * sizeof(float *));
+    float ** B = (float **)malloc(num * sizeof(float *));
+    float ** lum = (float **)malloc(num * sizeof(float *));
+
+    for (int i = 0; i < num; i++) {
+        R[i] = (float *)malloc(num * sizeof(float));
+        G[i] = (float *)malloc(num * sizeof(float));
+        B[i] = (float *)malloc(num * sizeof(float));
+        lum[i] = (float *)malloc(num * sizeof(float));
+    }
+
+    for (int y = 0; y < num; y++) {
+        for (int x = 0; x < num; x++) {
+            R[y][x] = x * 0.2;
+            G[y][x] = x * 0.3;
+            B[y][x] = x * 0.4;
+            lum[y][x] = 0.0;
+        }
+    }
+
+    convert(R, G, B, lum);
+    printf("Success!\n");
+    return 0;
+}
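
As a point of comparison for the vectorization experiments, here is a minimal sketch of the same luminance loop written over flat arrays. It is not part of this commit: the function name `convert_flat`, the `W_R`/`W_G`/`W_B` macros, and the one-contiguous-array-per-channel layout are assumptions made for illustration. The sketch uses `float` literals so the arithmetic is not promoted to `double`, and `restrict` to promise that the buffers do not overlap. Both changes may make the loop easier for GCC to vectorize, but the `-fopt-info-vec-all` output from the build script is what actually confirms it.

```c
#include <assert.h>
#include <stddef.h>

/* Rec. 709 luminance weights, written as float literals so the
 * arithmetic stays in single precision. */
#define W_R 0.2126f
#define W_G 0.7152f
#define W_B 0.0722f

/* Hypothetical variant of convert(): each channel is one contiguous
 * array of n floats, and restrict promises the buffers do not overlap. */
void convert_flat(const float *restrict r, const float *restrict g,
                  const float *restrict b, float *restrict lum, size_t n)
{
    assert(r && g && b && lum);
    for (size_t i = 0; i < n; i++) {
        lum[i] = W_R * r[i] + W_G * g[i] + W_B * b[i];
    }
}
```

For a `num` x `num` image, each channel would be allocated once with `malloc(num * num * sizeof(float))` and the call would be `convert_flat(R, G, B, lum, (size_t)num * num)`, with pixel `(y, x)` stored at index `y * num + x`.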