Photoshop’s amazing new AI “Generative Fill”

Adobe just added some of their “Firefly” generative artificial intelligence (AI) directly into Photoshop as a new beta feature called “Generative Fill”. It’s a very exciting development, with the potential to offer something Midjourney, DALL-E, and Stable Diffusion cannot: deep integration into Photoshop. What benefits might that offer?

  • Native support in Photoshop. There are already great plugins for tools like Stable Diffusion, but Adobe can offer a richer interface: you can create a selection and work directly with your current image. Ultimately, this offers the potential for greater user control, as well as the convenience of doing all your work right inside Photoshop.
  • Generate objects. You provide the source image, a location, and a description of what you’d like to add, and the AI does the work of combining them.
  • Remove objects. It’s like “content-aware fill” on steroids. I find that it can offer better results than the new remove tool in many cases (though the brushing workflow is very nice and they both have their uses).
  • Revise objects. Want to change from a yellow shirt to a red one? Just select the shirt with the lasso tool and tell Photoshop what you want.
  • Expand images. You can push things pretty far, and it often provides much better results than content-aware fill. Generative fill seems to work better in detailed areas, while content-aware fill seems to excel with gradients or areas of low detail such as the sky, so using both together may produce the best results.
  • Create new backgrounds. You can generate content from nothing, which may be ideal if you need a backdrop for a subject.
  • Fewer legal / commercial / ethical concerns. Firefly has been trained on Adobe Stock, so there is much less copyright concern with the source data used to train the AI. I’m no expert on the contractual terms and legal matters here, but this source certainly has significant benefits over scraping content from Pinterest, Flickr, etc., which does not include model or property releases. See Adobe’s FAQ for more details.

There are several ways you can invoke generative fill:

  • The Lumenzia Basics panel’s “Fill” button now offers generative fill (when the feature is enabled in PS). This not only gives one-button access to use it, but includes other enhancements:
    • It also includes a feature to “auto-mask subject”. This allows you to easily resize, rotate, or move content you’ve added without edge concerns. When you use this option to create a new fill layer, the last mask will automatically update to isolate the subject anytime you update the layer. This prevents issues with surrounding water, clouds, etc. failing to match the new surroundings after you transform your subject.
    • You can easily expand your image. Just use the crop tool (<C>) to expand the edges of the image and then click “Fill”. When using generative fill, just leave the prompt blank.
  • Make a selection with a tool such as the lasso, quick selection, or subject select and then go to Edit / Generative Fill (for the scripting-inclined, see the sketch after this list).
  • Via the “Contextual Task Bar” (Window / Contextual Task Bar). Whenever you create a selection, you’ll see a “generative fill” button in this floating task bar. Tip: turn on “sample all layers” for quick selection / subject select, as they won’t work very well once you start creating multiple layers.
  • Via voice commands through MacOS or Windows when using any of the above methods:
    • MacOS: Set up via System Preferences / Keyboard / Dictation. Enable dictation and set the “shortcut” you prefer (“Press Control Key Twice” works great with external keyboards). Use your shortcut, speak, and press <return> or your shortcut key again.
    • Windows: Start by pressing the Windows logo key + H, or press the microphone key if your keyboard has one.
  • Revise an existing generative fill layer by selecting the layer and opening the properties panel. You can click “generate” to create new options or change the text prompt to refine your concept. You can also select from other variations.
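
For those who like to automate, here is a minimal ExtendScript (JavaScript) sketch of the selection step behind these methods. The selection calls are standard, documented Photoshop DOM scripting; Adobe has not documented any scripting trigger for generative fill in the beta, so the “generativeFill” event and “prompt” key below are purely assumptions that may not work in your build.

    // Minimal ExtendScript sketch (run via File / Scripts / Browse... in Photoshop).
    app.preferences.rulerUnits = Units.PIXELS; // ensure coordinates are pixels
    var doc = app.activeDocument;

    // Select a rectangular region; each point is an [x, y] pair.
    doc.selection.select([[100, 100], [500, 100], [500, 400], [100, 400]]);

    try {
        // ASSUMPTION: "generativeFill" and "prompt" are guesses; Adobe has
        // not documented a scripting interface for this beta feature.
        var desc = new ActionDescriptor();
        desc.putString(stringIDToTypeID("prompt"), "pine trees");
        executeAction(stringIDToTypeID("generativeFill"), desc, DialogModes.NO);
    } catch (e) {
        alert("Not scriptable in this build; use Edit / Generative Fill instead.");
    }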

Note that the generative fill layer is created as a Smart Object with one layer for each version you see in the properties (you can right-click the layer and choose “convert to layers” to actually see this). This has a couple of implications:

  • Each of these layers does take a bit of space, so clicking the “x” on unused versions will help reduce your file size (you may also rasterize the layer to save further space if you are done revising it).
  • You can non-destructively apply filters to the layer.
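
If you script your workflow, that cleanup step is a one-liner using standard, documented ExtendScript calls (nothing here is specific to generative fill); just remember that rasterizing discards the smart object and its variations, so only do this once you are done revising.

    // Rasterize a finished generative layer to reclaim the space used by
    // its stored variations. Standard documented ExtendScript calls.
    var layer = app.activeDocument.activeLayer;
    if (layer.kind == LayerKind.SMARTOBJECT) {
        layer.rasterize(RasterizeType.ENTIRELAYER);
    }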

Capability like this naturally raises ethical questions around truth in imagery, and it should be noted that Generative Fill is designed to work with Content Credentials. This is an initiative involving companies like Adobe and the New York Times to create standards and a trail of evidence to help differentiate between original and altered content.


How good is it, and where do we go from here?

Is this a perfect AI? No, of course not – but that isn’t the goal at this stage. Adobe is making that very clear by releasing this as a feature only available in the beta version of Photoshop. This is what software developers call an MVP (minimum viable product): a chance to get user input and more experience to help build the real product. You should expect that (a) it has lots of limitations now and (b) it will get much better in the future. At this time, this is a tool best used for fun and experimentation at social media resolutions. Commercial usage is prohibited during the beta phase. But it’s very exciting to get a glimpse of where things are likely headed. All the use cases above are interesting to me and would be immensely beneficial with sufficient quality.

Even if you see no relevance to this kind of AI for your work in the near future, that’s unlikely to remain the case years from now. AI tools like this are going to be constantly evolving. Most people hadn’t heard of ChatGPT until it reached version 4, and this isn’t even version 1 of “generative fill”, “Firefly”, or whatever the product will be called over time. It’s an extremely exciting development with enormous potential to alleviate tedious work and open up new creative avenues for exploration.

Personally, I’m most excited about the potential for better methods of removing distractions from my images. Cloning is tedious work. I’ll probably expand some image edges as needed for certain formats and cropping factors. I’d be happy to make some tweaks to alter some colors. However, I don’t see myself adding subjects to images because I focus on creating images which share the experience of a place. The video above is just meant to give some sense of what’s possible. I’m not going to be adding fake animals to my portfolio images.

Everyone’s needs are different. This could be a great aid for someone who doesn’t have model releases for marketing work to simply swap real people with invented ones. Some people want to create fantasy images. There are so many potential uses, and I think ultimately the evolution will take a winding path as developers find out what people really want (and are willing to pay for). That said, I think there are some fairly clear avenues of continued improvement for tools like this.


Adobe’s standalone / website version (Firefly) already has several additional features that would be very useful in Photoshop, including:

  • Category options for style, color and tone, lighting, etc. Many of these are less necessary in this context when you’re filling a portion of the image (vs generating something from nothing on the website), but I do think a somewhat guided experience may provide more clarity in some cases. For example, a blank prompt currently may or may not remove a selected object – what’s the right approach? There is much to be learned about what interface works best, but I suspect a simple open-ended text input may feel a bit daunting for those who aren’t experts in “prompt engineering”.
  • “Text effects” to create gorgeous visual fonts. Text has many needs which are unique from image content, and options here will certainly be appreciated by users such as graphic designers.

Beyond that, there are several potential ways to expand the capability:

  • Higher resolution. The current results are limited to social media sizes; this isn’t something you’re going to print right now. Anytime you generate content, it is created at a maximum of 1024×1024 (though it will be upscaled from there automatically to fill the target space; see the quick calculation after this list). This isn’t surprising given Adobe is providing this for free at what must be significant cost to run on their servers, but obviously there will be a lot of demand for higher-resolution output in the future.
  • Improved image quality. There are artifacts, matching to the ambient lighting is hit or miss, the results may look more like art than a photograph, etc. This will obviously improve over time and I’m excited to see how it evolves. Whether training the AI from Adobe Stock is a limiting factor in the long run remains to be seen – that catalog reportedly includes hundreds of millions of images (vs billions used for Stable Diffusion). I suspect that as AI models continue to improve to work with less data, the quality of the training images is going to be more important than the quantity.
  • Improved user interface. The current design is very basic, as if to drive home the point that it’s a beta. You can’t just press <enter> after typing your text, you can’t double-click the layer to access properties for further editing, there is no option in the toolbar for the lasso tool, clicking OK before generate leaves an empty layer, only a few iterations are offered at a time, there is no way to specify a seed to revise an existing version, the previews are small, etc.
  • Negative prompts. You can’t currently type “remove the sign”, though selecting an object and typing nothing often will help remove it (though other times you just get another random object).
  • Better support to revise existing content. Unlike the demo I showed with the Stable Diffusion plugin (where I turned my photograph into a charcoal sketch), there isn’t quite the same mechanism for style transfer with generative fill. I can select someone’s shirt and type “blue shirt” to change its color, but if I select the whole image and type “charcoal drawing”, the result will bear no resemblance to the original photo. This kind of capability would be nice for altering the entire image (day to night conversions, changing the weather, time of day, style transfers, etc). The quality of targeted revisions also isn’t the same: if I select my mouth and type “frown” or “closed lip smile”, I don’t get the expected result.
  • On-device processing. The beta release of generative fill runs in the cloud, which means you have to be connected to the internet. Processing on your own computer would allow offline use and probably faster processing time.
  • AI-assisted compositing. Rather than using a text prompt to create new content, imagine that you just provide a background and a subject – then Photoshop cuts out the subject with no visible edge, matches color and tone, and creates shadows or reflections to complete the composite for you.
  • More flexible input. Support for languages other than English is key. It also needs to be more tolerant of typos (“brooom” should be recognized as an attempt to type “broom”). It’d be nice if you could use the arrow keys to cycle through the versions you create. And while you can already use your voice, imagine a richer interface where you give an initial idea (“add some pine trees”) and then continue to refine it with feedback (“make the middle one taller, show light rays coming through the trees, and warm up the shadows”).
  • Support for 32-bit HDR. Photoshop’s tools for cleaning up 32-bit images are limited to the clone stamp. There is no healing, spot healing, patch, or remove tool support. It would be very helpful to be able to remove things like lens flare in HDR images.
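
To make the resolution point from the list above concrete, here is the rough arithmetic (the 1024-pixel limit is from the beta as described above; the target size is just an illustrative example):

    // Rough arithmetic: how far a 1024px generation gets stretched to
    // fill a larger area. Example numbers, not anything from Adobe.
    var targetW = 3000, targetH = 2000; // example fill region in pixels
    var maxGen = 1024;                  // per-side generation limit in the beta
    var scale = Math.max(targetW, targetH) / maxGen;
    alert("Upscale factor: ~" + scale.toFixed(1) + "x"); // ~2.9x here

At nearly 3x upscaling, some softness is unavoidable, which is why the current output is best kept to social media sizes.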

There are an unlimited number of potential use cases here and it will be very exciting to see where the technology goes over time. What do you think? What capabilities and interface would you like to see for this sort of generative fill in Photoshop? I’d love to hear your thoughts in the comments below.

Troubleshooting

I’ve received various emails and comments from people who are unable to use Generative Fill, so I want to address the common issues here.

If you do not see an option for Generative Fill, please check the following:

  • Make sure you have installed the PS beta. Help / System Info must show 24.6.0 20230520.m.2181 (the one-line script after this list is a quick way to confirm the version, though not the build). Several people reported installing the beta initially and seeing a build older than “2181”.
  • Make sure you are running the PS beta (when you install the beta, it keeps the regular version and you can run either).
  • Check that your age shows at least 18 in Adobe’s system (contact [email protected] if unsure). Generative Fill in Photoshop (Beta) is only available to users of at least 18 years of age.
  • Make sure you have a supported license type: Creative Cloud Individual license, CC Teams, CC Enterprise, or Educational.
  • Note: Generative Fill is not available in China at this time.
  • See the official Adobe support page.
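
If you want to double-check from a script, app.version is a standard ExtendScript property; note that it reports only the version number (e.g. “24.6.0”), so the build number still has to come from Help / System Info.

    // Report the running Photoshop version (does not include the build).
    alert("Photoshop version: " + app.version);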

If you see Generative Fill, but it is greyed out / unavailable, please check the following:

  • Note: hovering over the Generate button should show a tooltip explaining why the feature is unavailable.
  • Make sure you have a selection
  • Make sure your image is in 8 / 16-bit mode; 32-bit HDR is not supported (the script after this list checks both of these requirements)
  • Use the “Fill” button in the Basics panel if you own Lumenzia. It is designed to address some edge cases (other than 32-bit, which is a fundamental limitation of the tool).
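
The selection and bit-depth requirements can also be verified from a script using only standard ExtendScript properties (selection.bounds throws an error when there is no active selection, hence the try/catch):

    // Verify the two scriptable requirements noted in the list above.
    var doc = app.activeDocument;
    if (doc.bitsPerChannel == BitsPerChannelType.THIRTYTWO) {
        alert("32-bit HDR document: Generative Fill is not supported.");
    }
    try {
        var b = doc.selection.bounds; // throws if there is no selection
    } catch (e) {
        alert("No active selection: make one before using Generative Fill.");
    }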

If you do not see the Generative Fill option in Lumenzia Basics, please check that you can see it as an option under the “Edit” menu in Photoshop and that you have updated to Lumenzia v11.4.0 or later.
