This project builds intuition for manipulating images with 2D convolutions and filters. We also dive deeper into image manipulation techniques such as Gaussian and Laplacian stacks.
Here, we use the finite difference operators Dx and Dy to find the partial derivatives of an image. Dx is defined as the row vector [1, -1], whereas Dy is defined as the column vector [[1], [-1]]. Convolving the image with Dx highlights vertical edges, and convolving it with Dy highlights horizontal edges. From the two convolved images, we produce a gradient magnitude image by taking the square root of the sum of the squared partial derivatives, or more formally sqrt(dx^2 + dy^2), where dx and dy are the results of convolving the image with Dx and Dy. Finally, to get an edge image, we binarize the gradient magnitude image using a threshold of 70.
cameraman.jpg |
Dx Convolution |
Dy Convolution |
Gradient Magnitude |
Gradient Magnitude Binarized with 70 |
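The steps above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: it assumes a grayscale image on a 0 to 255 scale, uses SciPy's `convolve2d` with symmetric padding (a choice made here, not stated in the writeup), and the helper names `gradient_magnitude` and `binarize` are hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators as defined in the text.
D_x = np.array([[1.0, -1.0]])      # row vector
D_y = np.array([[1.0], [-1.0]])    # column vector

def gradient_magnitude(image):
    """Return (dx, dy, magnitude) for a 2D grayscale image."""
    # Symmetric padding is an assumption; it avoids fake edges at the border.
    dx = convolve2d(image, D_x, mode="same", boundary="symm")
    dy = convolve2d(image, D_y, mode="same", boundary="symm")
    return dx, dy, np.sqrt(dx ** 2 + dy ** 2)

def binarize(magnitude, threshold):
    """Mark pixels whose gradient magnitude exceeds the threshold."""
    return (magnitude > threshold).astype(np.uint8)

# Synthetic stand-in for cameraman.jpg: dark left half, bright right half.
image = np.zeros((8, 8))
image[:, 4:] = 255.0

dx, dy, magnitude = gradient_magnitude(image)
edges = binarize(magnitude, threshold=70)
```

On this synthetic image, only Dx responds, because the single edge is vertical.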
Because the results from the finite difference operators were noisy, we decided to first smooth the image with a Gaussian filter. This removes much of the noise before we apply the finite difference operators. For the Gaussian filter, we chose a kernel size = 10 and sigma = 6. After applying the Gaussian filter, we proceed with the same techniques as before. The results are as follows:
cameraman.jpg |
Gaussian filter with ksize=10 and sigma=6 |
Dx Convolution |
Dy Convolution |
Gradient Magnitude |
Gradient Magnitude Binarized with 10 |
Compared to the binarized gradient magnitude image from before, the edges here are more pronounced: they appear thicker and stand out more. We also see less noise in the background, because the Gaussian filter acts as a low-pass filter that suppresses the high frequencies.
Because convolution is commutative and associative, we can create derivative of Gaussian (DoG) filters to speed up our computations. Specifically, we first convolve the Gaussian filter with each finite difference operator to get the filters DoGx and DoGy, and then convolve the original image with DoGx and DoGy to get the partial derivatives.
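As a rough sketch of the smoothing step, the Gaussian kernel can be built as the outer product of a normalized 1D Gaussian with itself. The helper names `gaussian_kernel` and `smooth` are hypothetical, and the kernel construction and symmetric padding are assumptions rather than the project's confirmed implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """2D Gaussian kernel of shape (ksize, ksize), normalized to sum to 1."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0     # sample positions centered on 0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def smooth(image, ksize=10, sigma=6):
    """Low-pass filter the image with the kernel size and sigma from the text."""
    kernel = gaussian_kernel(ksize, sigma)
    return convolve2d(image, kernel, mode="same", boundary="symm")

# Noisy step edge standing in for a real photo.
rng = np.random.default_rng(0)
noisy = np.zeros((32, 32))
noisy[:, 16:] = 255.0
noisy += rng.normal(0.0, 20.0, noisy.shape)

blurred = smooth(noisy)
```

One thing to keep in mind: with an even kernel size such as 10 the kernel has no central pixel, so the filtered image is shifted by half a pixel. Odd kernel sizes avoid this.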
DoGx Convolution |
DoGy Convolution |
Gradient Magnitude |
Gradient Magnitude Binarized with 10 |
Ignoring a tiny bit of noise, the results are identical after binarizing the gradient magnitude image. This is expected, since convolutions are commutative and associative.
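The equivalence can be checked numerically. The sketch below is an illustration on random data, not the project's code: `gaussian_kernel` is a hypothetical helper, and all convolutions use `convolve2d`'s default full output with zero padding, which is the setting where associativity holds exactly up to floating-point error.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Normalized 2D Gaussian kernel (hypothetical helper)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

D_x = np.array([[1.0, -1.0]])
D_y = np.array([[1.0], [-1.0]])
G = gaussian_kernel(10, 6)

# Derivative of Gaussian filters: difference the kernel once, up front.
DoG_x = convolve2d(G, D_x)   # default mode="full", zero padding
DoG_y = convolve2d(G, D_y)

rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Two convolutions over the image versus a single one with the DoG filter.
blur_then_diff = convolve2d(convolve2d(image, G), D_x)
single_pass = convolve2d(image, DoG_x)
max_difference = np.abs(blur_then_diff - single_pass).max()
```

When the outputs are cropped (`mode="same"`) or the border is padded with something other than zeros, the two paths can differ slightly near the image borders.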
Here, we sharpen images by boosting their high frequencies, following the unsharp masking technique. We start by convolving the original image with a Gaussian filter to generate a blurred version of the image. Since the Gaussian filter acts as a low-pass filter, the blurred image keeps mostly the low frequencies of the original image. To get the high frequencies, we subtract the blurred image from the original image. Afterwards, we add a scaled copy of the high frequencies back to the original image to make it appear sharper. The following results use a Gaussian filter with kernel size = 13 and sigma = 2. We use alpha (α) to denote how much of the high frequencies to add.
taj.jpg |
Blurred from Gaussian filter |
High frequencies |
Sharpened with α = 1 |
Sharpened with α = 5 |
Sharpened with α = 10 |
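In code, the sharpening step amounts to `image + alpha * (image - blurred)`. Here is a minimal sketch for a single-channel image with values in [0, 1]; the helper names, the clipping, and the symmetric padding are assumptions, and a color image would need the same operation applied per channel.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Normalized 2D Gaussian kernel (hypothetical helper)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def unsharp_mask(image, alpha, ksize=13, sigma=2):
    """Sharpen a grayscale image in [0, 1] by adding alpha times its high frequencies."""
    kernel = gaussian_kernel(ksize, sigma)
    blurred = convolve2d(image, kernel, mode="same", boundary="symm")
    high_frequencies = image - blurred           # original minus low-pass
    sharpened = image + alpha * high_frequencies
    return np.clip(sharpened, 0.0, 1.0)          # keep values displayable

# Soft step edge as a stand-in for a photo.
step = np.full((32, 32), 0.25)
step[:, 16:] = 0.75
sharpened = unsharp_mask(step, alpha=5)
```

Flat regions are left untouched because their high-frequency component is zero. Only the neighborhood of the edge gains contrast, which is what makes the image look sharper.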
The same technique is applied to an image of the pyramids of Giza.
giza.jpg |
Blurred |
High frequencies |
Sharpened with α = 2 |
Here, we apply this technique to an image of the Salesforce Tower in San Francisco. We first apply a Gaussian blur to the image and then sharpen the blurred image to try to recover the original, instead of just sharpening the original image.
salesforce.jpg |
Blurred image |
High frequencies of blurred image |
Blurred image sharpened with α = 3 |
In this next part, we generate hybrid images by combining the high frequencies of one image with the low frequencies of another. Because high frequencies dominate at a close viewing distance and low frequencies dominate from farther away, the combined image shows one picture up close and another from far away. To get the low frequencies of an image, we apply a Gaussian filter to the image. To get the high frequencies of an image, we first apply a Gaussian filter and then subtract the Gaussian-filtered image from the original image. Here are the results:
DerekPicture.jpg |
nutmeg.jpg |
Hybrid |
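A hybrid image can be sketched as a low-pass of one image plus a high-pass of the other. The writeup does not state the cutoff values, so `sigma_low`, `sigma_high`, and the kernel-size rule below are placeholders chosen for illustration, and the function names are hypothetical. The inputs are assumed to be aligned grayscale images of the same size with values in [0, 1].

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Normalized 2D Gaussian kernel (hypothetical helper)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def blur(image, sigma):
    # Odd kernel size covering roughly +/- 3 sigma (an assumed rule of thumb).
    ksize = int(6 * sigma) | 1
    return convolve2d(image, gaussian_kernel(ksize, sigma),
                      mode="same", boundary="symm")

def hybrid_image(far_image, near_image, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of far_image plus high frequencies of near_image."""
    low = blur(far_image, sigma_low)
    high = near_image - blur(near_image, sigma_high)   # original minus blurred
    return np.clip(low + high, 0.0, 1.0)

# Random arrays standing in for two aligned photos.
rng = np.random.default_rng(0)
far_image = rng.random((64, 64))
near_image = rng.random((64, 64))
hybrid = hybrid_image(far_image, near_image)
```

In practice the two sigmas are tuned per image pair. A larger `sigma_low` makes the far image blurrier, and a larger `sigma_high` lets more of the near image through.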
In this example, I created a hybrid image of my friend Raj with Derrick White, a famous basketball player on the Boston Celtics.
Raj |
Raj high frequency |
Derrick White |
Derrick White low frequency |
Hybrid |
I also visualized the process through frequency analysis. Below are the log magnitudes of the Fourier transforms of the two input images, the filtered images, and the final hybrid image:
Raj FFT |
Raj high frequency FFT |
Derrick White FFT |
Derrick White low frequency FFT |
Hybrid FFT |
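A log magnitude spectrum like the ones above can be computed with NumPy's FFT routines. This sketch assumes a grayscale input. The small `eps` added before the logarithm is a choice made here to avoid taking log(0), and `log_magnitude_spectrum` is a hypothetical name.

```python
import numpy as np

def log_magnitude_spectrum(gray_image, eps=1e-8):
    """Log magnitude of the 2D Fourier transform, zero frequency at the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray_image))
    return np.log(np.abs(spectrum) + eps)

# A constant image has all of its energy at the zero frequency.
image = np.ones((8, 8))
spectrum = log_magnitude_spectrum(image)
peak = np.unravel_index(np.argmax(spectrum), spectrum.shape)
```

After `fftshift`, low frequencies sit near the center of the plot. A low-pass filtered image should therefore look bright only around the center, while a high-pass filtered image should keep its energy away from the center.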
In this example, I combined two of my favorite characters from Oshi No Ko: Aqua and Ruby.
aqua.jpg |
ruby.jpg |
Hybrid |
In this example, I combined a picture of a boulder I found in Lake Tahoe with the Moyai emoji 🗿. The results didn't turn out too well, mainly because of how difficult it was to align the boulder with the moyai emoji.
rock.jpg |
moyai.jpg |
Hybrid |
Here, I generated the Gaussian and Laplacian stacks for images. These stacks are used for multi-resolution blending later on. The goal of this process is to be able to seamlessly blend two images together using a mask.
To generate the Gaussian stack, we start with the original image and apply a Gaussian filter. Here, we used a kernel size = 21 and a sigma = 7. Then, we apply another Gaussian filter on top of the result from the previous level. The process repeats for 5 levels, so each level keeps a progressively lower band of frequencies.
To generate the Laplacian stack, we take each pair of consecutive images from the Gaussian stack and subtract the blurrier one from the sharper one, which leaves the band of frequencies between the two levels. The last image of the Laplacian stack is the same as the last image of the Gaussian stack. A useful property of the Laplacian stack is that collapsing it (summing all of its levels) gives us the original image back.
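The two stacks and the collapse property can be sketched as follows. One assumption to flag: this sketch keeps the unfiltered image as the first level of the Gaussian stack, since that is what makes the Laplacian levels sum back to the original exactly. The project's own level numbering may differ. The function names are hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Normalized 2D Gaussian kernel (hypothetical helper)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(image, levels=5, ksize=21, sigma=7):
    """Repeatedly blur without downsampling; level 0 is the input (assumption)."""
    kernel = gaussian_kernel(ksize, sigma)
    stack = [image]
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], kernel, mode="same", boundary="symm"))
    return stack

def laplacian_stack(g_stack):
    """Differences of consecutive Gaussian levels, plus the last Gaussian level."""
    bands = [sharper - blurrier
             for sharper, blurrier in zip(g_stack[:-1], g_stack[1:])]
    return bands + [g_stack[-1]]

def collapse(l_stack):
    """Sum all levels of a Laplacian stack."""
    return np.sum(l_stack, axis=0)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
g_stack = gaussian_stack(image)
l_stack = laplacian_stack(g_stack)
reconstructed = collapse(l_stack)
```

The reconstruction works because the sum telescopes: every intermediate Gaussian level is added once and subtracted once, leaving only the first level.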
Gaussian Stack for Apple | Laplacian Stack for Apple |
---|---|
Level 1 | Level 1 |
Level 2 | Level 2 |
Level 3 | Level 3 |
Level 4 | Level 4 |
Level 5 | Level 5 |
Gaussian Stack for Orange | Laplacian Stack for Orange |
---|---|
Level 1 | Level 1 |
Level 2 | Level 2 |
Level 3 | Level 3 |
Level 4 | Level 4 |
Level 5 | Level 5 |
To blend the two images, we also need a Gaussian stack for the binary mask, built with the same technique as before by repeatedly applying the Gaussian filter to the mask. Then, we weight each Laplacian stack by the mask stack, multiplying the corresponding levels together. As a side note, we cannot apply the same mask stack to both images: one image is weighted by the Gaussian stack of the mask, and the other by the Gaussian stack of the flipped mask, so that the two weights complement each other.
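Putting the pieces together, the blend can be sketched as below. The assumptions are the same as elsewhere in these examples: grayscale inputs of equal size, a mask with values in {0, 1}, the unfiltered input kept as the first stack level, and hypothetical helper names. The flipped mask is expressed as `1 - mask` at each level.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Normalized 2D Gaussian kernel (hypothetical helper)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(image, levels=5, ksize=21, sigma=7):
    kernel = gaussian_kernel(ksize, sigma)
    stack = [image]                                  # level 0 is the input
    for _ in range(levels - 1):
        stack.append(convolve2d(stack[-1], kernel, mode="same", boundary="symm"))
    return stack

def laplacian_stack(g_stack):
    return [a - b for a, b in zip(g_stack[:-1], g_stack[1:])] + [g_stack[-1]]

def blend(image_a, image_b, mask, levels=5):
    """Multi-resolution blend: mask selects image_a, 1 - mask selects image_b."""
    l_a = laplacian_stack(gaussian_stack(image_a, levels))
    l_b = laplacian_stack(gaussian_stack(image_b, levels))
    g_mask = gaussian_stack(mask, levels)
    blended = [m * a + (1.0 - m) * b for m, a, b in zip(g_mask, l_a, l_b)]
    return np.sum(blended, axis=0)                   # collapse the blended stack

# Vertical seam: left half from image_a, right half from image_b.
image_a = np.ones((32, 32))
image_b = np.zeros((32, 32))
mask = np.zeros((32, 32))
mask[:, :16] = 1.0
result = blend(image_a, image_b, mask)
```

Because the mask is blurred more at the coarser levels, low frequencies are mixed over a wide region while fine detail switches over close to the seam.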
Apple + Mask | Orange + Mask | Blended Stack |
---|---|---|
Level 1 | Level 1 | Level 1 |
Level 2 | Level 2 | Level 2 |
Level 3 | Level 3 | Level 3 |
Level 4 | Level 4 | Level 4 |
Level 5 | Level 5 | Level 5 |
Here is the result of collapsing the blended Laplacian stack from the previous section.
The Orapple |
I blended together an image of a beach from Key Largo, Florida with an image of Mount Fuji in Japan.
Gaussian Stack for Beach | Laplacian Stack for Beach |
---|---|
Level 1 | Level 1 |
Level 2 | Level 2 |
Level 3 | Level 3 |
Level 4 | Level 4 |
Level 5 | Level 5 |
Gaussian Stack for Mount Fuji | Laplacian Stack for Mount Fuji |
---|---|
Level 1 | Level 1 |
Level 2 | Level 2 |
Level 3 | Level 3 |
Level 4 | Level 4 |
Level 5 | Level 5 |
I used a horizontal mask this time instead of a vertical mask. Here are the results:
Level 1 |
Level 2 |
Level 3 |
Level 4 |
Level 5 |
Here are the blended images from applying the mask to each of the Laplacian stacks, using the same technique as before.
Beach + Mask | Fuji + Mask | Blended Stack |
---|---|---|
Level 1 | Level 1 | Level 1 |
Level 2 | Level 2 | Level 2 |
Level 3 | Level 3 | Level 3 |
Level 4 | Level 4 | Level 4 |
Level 5 | Level 5 | Level 5 |
Seeing Mount Fuji at the Beach |
My friend Diego loves to go to Mexico, so I used an irregular mask to blend him into a town in Mexico. To create the irregular mask, I used Photoshop to mask out the area that I wanted to blend, and then applied the same blending technique.
diego.jpg |
mexico.jpg |
Irregular mask |
Diego in Mexico! |
During the 2024 NBA playoffs, fans often compared Anthony Edwards, a superstar on the Minnesota Timberwolves, to Michael Jordan (from the Chicago Bulls, not from the Berkeley EECS Department). Some have even nicknamed Anthony Edwards "Michael Jordan's Son". I decided to blend together images of Anthony Edwards and Michael Jordan, but the results were not so great since the seam was noticeable.
Anthony Edwards |
Michael Jordan |
Michael Jordan's Son |
I had fun learning how frequencies show up in different images, as well as manipulating those frequencies to create images of my own. I also learned a lot about the different ways I can take advantage of Gaussian filters to create new effects.
Website template is credited to CS 184