Google warns against misusing links in syndication & large-scale article campaigns

This article was originally posted on SearchEngineLand and was written by Danny Sullivan.

If the primary purpose of distributing content is to gain links, both authors and publishers risk a Google penalty.

Google’s out today with a warning for anyone who is distributing or publishing content through syndication or other large-scale means: Watch your links.

Google’s post reminds those who produce content published in multiple places that, without care, they could be violating Google’s rules against link schemes.

No content marketing primarily for links, warns Google

Google says that it is not against article distribution in general. But if such distribution is done primarily to gain links, then there’s a problem. From the post:

Google does not discourage these types of articles in the cases when they inform users, educate another site’s audience or bring awareness to your cause or company. However, what does violate Google’s guidelines on link schemes is when the main intent is to build links in a large-scale way back to the author’s site …

For websites creating articles made for links, Google takes action on this behavior because it’s bad for the Web as a whole. When link building comes first, the quality of the articles can suffer and create a bad experience for users.

Those pushing such content want links because links — especially from reputable publishers — are one of the top ways that content can rank better on Google.

Warning signs

What might lead Google to view a content distribution campaign as violating its guidelines? Again, from the post:

Stuffing keyword-rich links to your site in your articles

Having the articles published across many different sites; alternatively, having a large number of articles on a few large, different sites

Using or hiring article writers that aren’t knowledgeable about the topics they’re writing on

Using the same or similar content across these articles; alternatively, duplicating the full content of articles found on your own site

Staying safe

There are two safe ways for those distributing content to stay out of trouble: using nofollow on specific links or the canonical tag on the page itself.

Nofollow prevents individual links from passing along ranking credit. Canonical effectively tells Google not to let any of the links on the page pass credit.
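For reference, here’s a minimal sketch of what each mechanism looks like in markup (the URLs are hypothetical, for illustration only):

  <!-- nofollow on an individual link, so that this one link passes no ranking credit -->
  <a href="https://example.com/spring-sale" rel="nofollow">spring lawn care deals</a>

  <!-- canonical tag in the <head> of the syndicated copy, pointing back at the
       original article, which keeps the copy (and its links) from passing credit -->
  <link rel="canonical" href="https://example.com/original-article">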

Publishers can be at risk, too

It’s important to note that Google’s warning isn’t just for those distributing content. Those publishing it can face issues with Google if they haven’t taken proper care. From Google’s post:

When Google detects that a website is publishing articles that contain spammy links, this may change Google’s perception of the quality of the site and could affect its ranking.

Sites accepting and publishing such articles should carefully vet them, asking questions like: Do I know this person? Does this person’s message fit with my site’s audience? Does the article contain useful content? If there are links of questionable intent in the article, has the author used rel=”nofollow” on them?

In other words, publishing content without scrutinizing its links could expose the publisher’s site to being penalized by Google.

Why this new warning?

Today’s warning from Google is generally the same as what it issued back in July 2013, when it cautioned about links in large-scale guest posting, advertorials, sponsored content and press releases. However, it’s more specific in terms of syndication and comes because of an issue that Search Engine Land has been investigating over the past month.

Search Engine Land has a policy of generally not writing about cases of spam or suspected spam that aren’t already public in a significant way. Our open letter from 2014 explains this more. In short, if we did this, that’s all we would ever be writing about.

That said, we received a tip about several businesses using article syndication that seemed worth taking a closer look at, given that the tactics potentially violated Google’s guidelines in a significant manner. Moreover, Google had been notified of the issue at the end of last year, twice, but had not apparently taken any action. The company tipping us — a competitor with those businesses — was concerned. Was this tactic acceptable or not?

The many examples I looked at certainly raised concerns. Articles were distributed across multiple news publications. The articles often contained several links that were “anchor rich,” meaning they appeared to have words within the links that someone hoped they might rank well for. Mechanisms for blocking these links from passing credit were not being used.

Google’s initial response to our questions about this was that it was aware there were issues and that it was looking to see how it might improve things.

That seemed a weak response to me. It was pretty clear from my conversations with two of the companies distributing the content, and one of the publishers, that there was, at the very least, confusion about what was acceptable and responsibilities all around.

Confusion about what’s allowed

Both the companies producing content professed that they felt they were doing nothing wrong. In particular, they never demanded that publishers carry any particular links, which seemed to them to put them on the right side of the guidelines. One also said that it was using canonical to block link credit but that the publishers themselves might be failing to implement that correctly. Both indicated that if they weren’t doing things correctly, they wanted to change to be in compliance.

In short: it’s not us to blame, it’s those publishers. And from the content I looked at on publisher sites, it was pretty clear that none of them seemed to be doing any policing of links. That was reinforced after I talked with one publisher, which told me that while it did make use of nofollow, it was reviewing things to be more “aggressive” about it now. My impression was that if nofollow was supposed to be used, no one had really been paying attention to that — nor was I seeing it in use.

In the end, I suggested to Google that the best way forward here might be for them to post fresh guidance on the topic. That way, Search Engine Land wasn’t being dragged into a potential spam reporting situation. More important, everyone across the web was getting an effective “reset” and reeducation on what’s allowed in this area.

Getting your house in order

Now that such a post has been done, companies distributing such content and publishers carrying it would be smart to follow the advice in it. When Google issues such advice, as it did about guest blogging in January 2014, that’s often followed by the search engine taking action against violators a few months later.

From a distributor point of view, I’d recommend thinking strongly about how Google ended today’s blog post:

If a link is a form of endorsement, and you’re the one creating most of the endorsements for your own site, is this putting forth the best impression of your site? Our best advice in relation to link building is to focus on improving your site’s content and everything — including links — will follow (no pun intended).

Bottom line: Deep down, you know if you were putting out this content primarily to gain links. If that was the case, you should work with those publishers to implement nofollow or canonical. If you can’t, then you should consider disavowing links to Google.
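For those unfamiliar with it, the disavow file is a plain text list uploaded through Search Console’s disavow links tool, with one URL or domain per line. A minimal sketch, with hypothetical domains:

  # Syndicated copies where we couldn't get nofollow or canonical implemented
  domain:example-news-site.com
  https://another-example.com/syndicated-article/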

Going forward, I’d look to implement nofollow or canonical as Google recommends, if you find that the large-scale distribution is bringing you useful direct clicks and attention.

I will say that no one should take this to mean that you can never distribute content or that content can’t have any links at all that pass credit back to an originating site. Indeed, we have plenty of contributed content here on Search Engine Land. I’d be among the first screaming at Google if I thought it was trying to tell us or anyone that you couldn’t have such content unless you blocked all links.

Things that make us feel Google-safe are that, most of all, we publish original content from contributors. It’s not the same content that’s simply dumped into multiple publications. Also, we have editors who often spend a significant amount of time working with writers and content to ensure that it’s publication-worthy. And we do try to watch for links that we don’t feel are earned or necessary in a story.

We’re not perfect. No publisher will be. But I think from a publisher perspective, the more you are actually interacting with the content you publish to review and approve it, rather than blindly posting from a feed, the safer you will be. If you haven’t been doing that, then consider making use of nofollow and canonical on already-published content, as Google recommended.

As for those guest blogging requests

I’ll conclude with this part of Google’s post today:

Webmasters generally prefer not to receive aggressive or repeated “Post my article!” requests, and we encourage such cases to be reported to our spam report form.

Indeed. It’s amazing how many requests like this we’re getting each day, and I know we’re not alone. It’s even more amazing when this type of guest blogging was supposed to be over.

“Stick A Fork In It, Guest Blogging Is Done,” declared Matt Cutts in January 2014. Cutts, no longer at Google, was then the head of its web spam fighting team. His declaration was a shot heard around the web. Guest blogging almost became radioactive. No one seemed to want to touch it, much less send out idiotic bulk emails requesting a post.

Those requests are back in force. It’s a pity that so many come from Google’s own Gmail system, where all Google’s vaunted machine learning doesn’t catch them as the spam they are.

If you’ve been making such requests or accepting guest blog posts because of them, even at a small scale, Google’s rules about policing links still apply.

Google is extending in-market audience targeting to Search campaigns

This article first appeared on SearchEngineLand and was written by Ginny Marvin (@ginnymarvin).

Advertisers will be able to target users based on purchase intent signals in Search campaigns for more than a dozen categories.

Google is continuing to extend its audience targeting capabilities into Search. The company announced Tuesday that In-market audiences, currently only available for Display Network and YouTube campaigns, will be coming to Search campaigns.

Google shared the news in a blog post released ahead of its annual live-streamed event, Google Marketing Next.

First introduced in 2013 under the name In-market buyers, the targeting is aimed at reaching consumers who are getting ready to make a purchase, based on an analysis of intent signals such as recent search queries and website browsing activity. From today’s blog post:

For example, if you’re a car dealership, you can increase your reach among users who have already searched for “SUVs with best gas mileage” and “spacious SUVs.”

There are currently more than a dozen In-market audiences available in AdWords to target users looking to buy things such as apparel, baby products, event tickets or real estate.

 

Along with similar audiences for Search and Shopping, the addition of these targeting options marks Google’s shift to tapping user search history for targeting in Search campaigns. It does so in an aggregated, anonymized way, but the company had long resisted incorporating that data in Search targeting for privacy reasons. Then Facebook came along, and advertiser expectations — and some would say consumer acceptance — of targeting capabilities changed with it.

It’s not clear what the timing will be on the rollout. It took roughly a year for similar audiences for Search and Shopping to roll out generally after Google first announced it at last year’s live-streamed event.

5 Reasons Why Your Business Needs to Start Making Vertical Video for Social Media

First published on SocialMedia Today by Andrew Macarthy.

Does your business record vertical videos for social media?

In years gone by, recording and uploading video with the camera held vertically was looked upon with ridicule, producing big black bars either side of the picture and a narrow viewing angle, guaranteed to turn viewers off.

But times are changing. 

In this post, I’m going to lay out five reasons why your business should be experimenting with vertical video for social media marketing in 2017, and the potential benefits it can bring.

1. People naturally hold their phones vertically 

Obvious, but important.

If we strip smartphones back to their most basic function – giving users the ability to make and receive phone calls – the design of modern smartphones simply follows the tradition of “dumb” phones from decades past: the device should be held vertically so that the user can speak and listen with minimal fuss. TV and cinema, meanwhile – the dominant visual media for so long – have demanded that the picture is viewed horizontally for the best experience. And so, despite all the things smartphones can now do, we’re historically conditioned to hold phones vertically and view video horizontally.

We’ve been stuck between two competing worlds, but times are changing.

For some hard facts, look to the MOVR Mobile Overview Report from December 2014, which found, unsurprisingly, that smartphone users hold their phones vertically about 94% of the time.


Image credit: Form Meets Function

With the huge increase in mobile-recorded video content online in recent years, it’s no surprise that vertical device usage is also on the up.

A 2016 study by KPCB Research showed that people in the US now spend 29% of their time using vertically-held devices, up from just 5% in 2010. And since people are holding these devices vertically for most tasks, it makes sense that they’ll play video that way, too.

2. People access social media on mobile the most


Not only are the vast majority of mobile apps designed with the assumption that users will be interacting while holding their smartphone vertically, but it’s also increasingly where people spend their time on social media.

comScore’s 2016 U.S. Cross-Platform Future in Focus study showed that nearly 80% of social media use now occurs on mobile devices – 61% on smartphones alone.

In addition, the gap between desktop and mobile Internet use in the US is predicted to widen: by 2018, people are expected to spend 3 hours and 20 minutes a day using the Internet on their phones, compared to just 40 minutes on the computer.

As a business, your content needs to be built in a way that caters best to how it is being consumed.

3. Social networks are vertical video-friendly

If we look at how vertical video renders on social apps, we see something interesting.

As of February 2017:

  • Facebook – Vertical videos publish with no black borders
  • Instagram – Vertical videos publish with no black borders
  • Snapchat – Vertical videos publish with no black borders
  • Twitter – Vertical videos publish with no black borders
  • YouTube – The Android version of the app hides black borders when device is held vertically and video is viewed in full screen

Rather than looking upon vertical video as a negative, social networks – even YouTube – are embracing the format. There’s no way they’re going to be able to force people to film in landscape mode without really annoying them (as YouTube used to do), so why not make the viewing experience (inferior to landscape as it may be) as good as it can be?

And there are other benefits. When it comes to watching live video, viewers holding their phones vertically can engage with reactions and comments in a way that’s natural to them.

When Facebook rolled out the 2:3 ratio for vertical video in August 2016 (rather than cropping vertical videos into squares), a company representative told Marketing Land:

“We know that people enjoy more immersive experiences on Facebook, so we’re starting to display a larger portion of each vertical video in News Feed on mobile.”

Facebook wants people to stay on its platform, so it will do whatever it can to suit their needs.

Instagram’s roll-out of vertical video last year shared a similar sentiment.

In a blog post to announce its Stories feature (boasting over 150 million users as of January 2017), Instagram said:

“Square format has been and always will be part of who we are. That said, the visual story you’re trying to tell should always come first, and we want to make it simple and fun for you to share moments the way you want to.”

4. Vertical video ads convert better


Vertical video ads are growing in favor with advertisers as well – and when you understand that it’s the orientation in which users are increasingly consuming video, you see why.

“From a storytelling perspective, this is obviously more exciting,” Dan Grossman, VP of platform partnerships at VaynerMedia, told Mashable. “If we can take up more of the screen that means you’re less distracted. We can capture more of the viewer’s attention.”

In another recent development – the launch of vertical video ads for Instagram Stories in January 2017 – Instagram expressed similar confidence:

“Since the beginning, we’ve been thoughtful about rolling out ads on Instagram to give businesses and consumers the best experience possible. And ad formats are no exception. Portrait has long been available on the platform for posts, and is a common format for consuming mobile content”.


For more evidence, look to Snapchat, which you might say spearheaded the vertical video revolution in social media.

In a pitch to publishers in 2015, Snapchat reported that full screen vertical video ad completion rates were 9x higher than those of horizontal video ads.

The company’s internal research also shows that vertical video ads draw up to 2x higher visual attention vs. comparable platforms.

“Communicate your brand message in a way that fits your phone, the way Snapchatters actually use it.”


In addition, Jason Stein, CEO of Laundry Service, reported success with LG vertical video ads soon after their launch last year. He told Adweek that the ads had been receiving CPM rates that were 3x more efficient than standard square videos on Facebook.

5. Your customers are lazy

Whisper it, but it’s true – and as customers ourselves, we’re all guilty of it.

Think of it like this: when users are zipping through mobile sites and social media feeds, they expect the experience to be seamless.

If your video plays in landscape and can be viewed okay, not many people are going to make the effort to turn their phone 90 degrees and tap to expand to full screen. It’s lazy, but it’s the truth.

As a marketer, that means you’re missing out on filling a user’s screen with your ad and keeping their full attention as effectively as possible.

Perhaps this point is best summed up by Zena Barakat, a former New York Times video producer who spent a year researching vertical video. She discovered that many people didn’t reorient their phones to watch horizontal videos in full-screen mode.

“As a person who makes videos, I was like, ‘You’re not seeing it the way we intended it!’ And they were like, ‘We don’t care!’ They found it so uncomfortable to hold the phone the other way, and they didn’t want to keep switching their phones back and forth.”

Over to you

What are your thoughts on the vertical video revolution? Will you be experimenting with vertical video on social for your brand?

Get to Know Fred and Modern Google SEO

This article first appeared in WebsiteMagazine by Peter Prestipino.

In late 2016, many in the search engine optimization (SEO) industry knew something big was coming. 

Right on cue, Google made a rather significant update to its search algorithm in early March 2017, code named “Fred,” and the resulting impact is now encouraging many in the digital marketing community to rethink not just their approach to the practice of optimization for search, but also the experience they develop in general.

Let’s take a closer look at the recent update (review a timeline of significant Google algorithm changes over the years at wsm.co/algotime) and get to know “Fred,” the current likes and dislikes of Google when it comes to the Web experience and some current best practices for consistently generating more organic traffic to websites.

Fred’s Focus

Google has acknowledged that it actually makes hundreds of updates each year to its core algorithm (it is being reported, in fact, that Google even released a few other updates at the same time Fred appeared), but its most recent has left many search marketers low on traffic and high on questions.

Fortunately, most updates of this magnitude (and pretty much all updates, for that matter) tend to focus on the same variables: either links or content.

In essence, if organic traffic was impacted, it was because the site was in some way violating the webmaster guidelines on quality. While there is no way to know for sure what the focus of Fred was, based on conversations with other search marketers and some research into the search results, the vast majority of the sites impacted appear to be content sites, and those hit hardest had two things in common – their content was “shallow” and their websites prioritized advertising over the experience of the user.

This is, of course, speculation (although informed speculation), but the impact has been quite substantial for sites that leverage a model or approach where advertising in its variety of forms encroaches on the digital experience of the user.

The Aftermath

Just how bad was the Fred update for these types of websites? Some SEOs and webmasters have actually reported 50-90 percent reductions in their organic traffic from Google. As one can imagine, that sort of drop in traffic is serious, but there are some things search marketers can do, and some things that they most certainly should not if they were the focus of this update.

The last thing companies want to do, for example, is panic, deleting pages without reason or modifying URL structures. While many in this position can get some traffic back over time, most of those that employed tactics outside Google’s Webmaster Guidelines are more likely to simply abandon their sites than put in the work required to fix their mistakes and get on the right track.

Necessary Steps

Should a website be one of those impacted, and should an enterprise be committed to regaining its rankings and resulting organic traffic, there are some steps that can be taken. If advertising is indeed the reason, consider how the digital property is monetized.

Are there simply too many ad units on the page? Too many popups still appearing for mobile users? A change related to how websites generate revenue may be in order. Should shallow content be to blame, there are also some corrective actions that can be taken, including identifying those pages which suffered a reduction in traffic, and including additional relevant content that is useful to users.

It seems so simple in theory – and it is. The tactical side of course is far more complex, but making these changes on a strategic level will be increasingly necessary if success is in the future plan.

 

How to Generate Content Ideas Using Screaming Frog in 20(ish) Minutes

by Todd McDonald and first published on Moz.com.

A steady rise in content-related marketing disciplines and an increasing connection between effective SEO and content has made the benefits of harnessing strategic content clearer than ever. However, success isn’t always easy. It’s often quite difficult, as I’m sure many of you know.

A number of challenges must be overcome for success to be realized from end-to-end, and finding quick ways to keep your content ideas fresh and relevant is invaluable. To help with this facet of developing strategic content, I’ve laid out a process below that shows how a few SEO tools and a little creativity can help you identify content ideas based on actual conversations your audience is having online.

What you’ll need

Screaming Frog: The first thing you’ll need is a copy of Screaming Frog (SF) and a license. Fortunately, it isn’t expensive (around $150 USD per year), and there are a number of tutorials if you aren’t familiar with the program. After you’ve downloaded and set it up, you’re ready to get to work.

Google AdWords Account: Most of you will have access to an AdWords account due to actually running ads through it. If you aren’t active with the AdWords system, you can still create an account and use the tools for free, although the process has gotten more annoying over the years.

Excel/Google Drive (Sheets): Either one will do. You’ll need something to work with the data outside of SF.

Browser: We walk through the examples below utilizing Chrome.

The concept

One way to gather ideas for content is to aggregate data on what your target audience is talking about. There are a number of ways to do this, including utilizing search data, but it lags behind real-time social discussions, and the various tools we have at our disposal as SEOs rarely show the full picture without A LOT of monkey business. In some situations, determining intent can be tricky and require further digging and research. On the flipside, gathering information on social conversations isn’t necessarily that quick either (Twitter threads, Facebook discussion, etc.), and many tools that have been built to enhance this process are cost-prohibitive.

But what if you could efficiently uncover hundreds of specific topics, long-tail queries, questions, and more that your audience is talking about, and you could do it in around 20 minutes of focused work? That would be sweet, right? Well, it can be done by using SF to crawl discussions that your audience is having online in forums, on blogs, Q&A sites, and more.

Still here? Good, let’s do this.

The process

Step 1 – Identifying targets

The first thing you’ll need to do is identify locations where your ideal audience is discussing topics related to your industry. While you may already have a good sense of where these places are, expanding your list or identifying sites that match well with specific segments of your audience can be very valuable. In order to complete this task, I’ll utilize Google’s Display Planner. For the purposes of this article, I’ll walk through this process for a pretend content-driven site in the Home and Garden vertical.

Please note, searches within Google or other search engines can also be a helpful part of this process, especially if you’re familiar with advanced operators and can identify platforms with obvious signatures that sites in your vertical often use for community areas. WordPress and vBulletin are examples of that.
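As a hypothetical illustration, a few searches along these lines can surface discussion areas in a vertical (the keywords are invented; the footprints are each platform’s well-known defaults):

  gardening inurl:forum
  gardening inurl:showthread.php          (a classic vBulletin thread URL pattern)
  "powered by vbulletin" gardening        (vBulletin's default footer text)
  "leave a reply" container gardening     (WordPress's default comment prompt)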

Google’s Display Planner

Before getting started, I want to note I won’t be going deep on how to use the Display Planner for the sake of time, and because there are a number of resources covering the topic. I highly suggest some background reading if you’re not familiar with it, or at least do some brief hands-on experimenting.

I’ll start by looking for options in Google’s Display Planner by entering keywords related to my website and the topics of interest to my audience. I’ll use the single word “gardening.” In the screenshot below, I’ve selected “individual targeting ideas” from the menu mid-page, and then “sites.” This allows me to see specific sites the system believes match well with my targeting parameters.

[Screenshot: site targeting ideas in Google’s Display Planner]

I’ll then select a top result to see a variety of information tied to the site, including demographics and main topics. Notice that I could refine my search results further by utilizing the filters on the left side of the screen under “Campaign Targeting.” For now, I’m happy with my results and won’t bother adjusting these.

Step 2 – Setting up Screaming Frog

Next, I’ll take the website URL and open it in Chrome.

Once on the site, I need to first confirm that there’s a portion of the site where discussion is taking place. Typically, you’ll be looking for forums, message boards, comment sections on articles or blog posts, etc. Essentially, any place where users are interacting can work, depending on your goals.

In this case, I’m in luck. My first target has a “Gardening Questions” section that’s essentially a message board.

[Screenshot: the site’s “Gardening Questions” section]

A quick look at a few of the thread names shows a variety of questions being asked and a good number of threads to work with. The specific parameters around this are up to you — just a simple judgment call.

Now for the fun part — time to fire up Screaming Frog!

I’ll utilize the “Custom Extraction” feature found here:

Configuration → Custom → Extraction

…within SF (you can find more details and a broader set of use-case documentation for this feature here). Utilizing Custom Extraction will allow me to grab specific text (or other elements) off of a set of pages.

Configuring extraction parameters

I’ll start by configuring the extraction parameters.

[Screenshot: Screaming Frog custom extraction settings]

In this shot I’ve opened the custom extraction settings and have set the first extractor to XPath. I need multiple extractors set up, because multiple thread titles on the same URL need to be grabbed. You can simply cut and paste the code into the next extractors — but be sure to update the number sequence (outlined in orange) at the end to avoid grabbing the same information over and over.
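To make that concrete, here’s a hypothetical set of extractors (the element and class names are invented; yours will come from the inspect step described below). Notice that only the index changes from one extractor to the next:

  Extractor 1   XPath   //ul[@class="thread-list"]/li[1]//a[@class="thread-title"]
  Extractor 2   XPath   //ul[@class="thread-list"]/li[2]//a[@class="thread-title"]
  Extractor 3   XPath   //ul[@class="thread-list"]/li[3]//a[@class="thread-title"]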

Notice as well, I’ve set the extraction type to “extract text.” This is typically the cleanest way to grab the information needed, although experimentation with the other options may be required if you’re having trouble getting the data you need.

Tip: As you work on this, you might find you need to grab different parts of the HTML than what you thought. This process of getting things dialed can take some trial-and-error (more on this below).

Grabbing XPath code

To grab the actual extraction code we need (visible in the middle box above):

  1. Use Chrome
  2. Navigate to a URL with the content you want to capture
  3. Right-click on the text you’d like to grab and select “inspect” or “inspect element”

[Screenshot: inspecting the thread title element in Chrome]

Make sure you see the text you want highlighted in the code view, then right-click and select “Copy XPath” (you can use other options, but I recommend reviewing the SF documentation mentioned above first).

[Screenshot: copying the XPath from Chrome’s right-click menu]

It’s worth noting that many times, when you’re trying to grab the XPath for the text you want, you’ll actually need to select the HTML element one level above the text selected in the front-end view of the website (step three above).

At this point, it’s not a bad idea to run a very brief test crawl to make sure the desired information is being pulled. To do this:

  1. Start the crawler on the URL of the page where the XPath information was copied from
  2. Stop the crawler after about 10–15 seconds and navigate to the “custom” tab of SF, set the filter to “extraction” (or something different if you adjusted naming in some way), and look for data in the extractor fields (scroll right). If this is done right, I’ll see the text I wanted to grab next to one of the first URLs crawled. Bingo.

[Screenshot: extraction data in Screaming Frog’s “custom” tab]

Resolving extraction issues & controlling the crawl

Everything looks good in my example, on the surface. What you’ll likely notice, however, is that there are other URLs listed without extraction text. This can happen when the code is slightly different on certain pages, or SF moves on to other site sections. I have a few options to resolve this issue:

  1. Crawl other batches of pages separately walking through this same process, but with adjusted XPath code taken from one of the other URLs.
  2. Switch to using regex or another option besides XPath to help broaden parameters and potentially capture the information I’m after on other pages.
  3. Ignore the pages altogether and exclude them from the crawl.

In this situation, I’m going to exclude the pages I can’t pull information from based on my current settings and lock SF into the content we want. This may be another point of experimentation, but it doesn’t take much experience for you to get a feel for the direction you’ll want to go if the problem arises.

In order to lock SF to URLs I would like data from, I’ll use the “include” and “exclude” options under the “configuration” menu item. I’ll start with include options.

[Screenshot: Screaming Frog include configuration]

Here, I can configure SF to only crawl specific URLs on the site using regex. In this case, what’s needed is fairly simple — I just want to include anything in the /questions/ subfolder, which is where I originally found the content I want to scrape. One parameter is all that’s required, and it happens to match the example given within SF ☺:
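Assuming the discussion URLs all live under a /questions/ path, the include rule takes this form:

  .*/questions/.*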

The “excludes” are where things get slightly (but only slightly) trickier.

During the initial crawl, I took note of a number of URLs that SF was not extracting information from. In this instance, these pages are neatly tucked into various subfolders. This makes exclusion easy as long as I can find and appropriately define them.

[Screenshot: Screaming Frog exclude configuration]

In order to cut these folders out, I’ll add the following lines to the exclude filter:
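The actual subfolders were specific to the site being crawled, but as a hypothetical illustration, exclude rules take the same regex form as the include above:

  .*/questions/users/.*
  .*/questions/unanswered/.*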

Upon further testing, I discovered I needed to exclude the following folders as well:
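Again, hypothetical examples of the form:

  .*/questions/tags/.*
  .*/questions/search/.*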

It’s worth noting that you don’t HAVE to work through this part of configuring SF to get the data you want. If SF is let loose, it will crawl everything within the start folder, which would also include the data I want. The refinements above are far more efficient from a crawl perspective and also lessen the chance I’ll be a pest to the site. It’s good to play nice.

Completed crawl & extraction example

Here’s how things look now that I’ve got the crawl dialed:

[Screenshot: the completed crawl with extraction data]

Now I’m 99.9% good to go! The last crawl configuration is to reduce speed to avoid negatively impacting the website (or getting throttled). This can easily be done by going to Configuration → Speed and reducing the number of threads and the number of URIs requested per second. I usually stick with something at or under 5 threads and 2 URI/s.

Step 3 – Ideas for analyzing data

After the end goal is reached (run time, URIs crawled, etc.), it’s time to stop the crawl and move on to data analysis. There are a number of ways to start breaking apart the information grabbed that can be helpful, but for now I’ll walk through one approach with a couple of variations.

Identifying popular words and phrases

My objective is to help generate content ideas and identify words and phrases that my target audience is using in a social setting. To do that, I’ll use a couple of simple tools to help me break apart my information:

  • Tagcrowd.com
  • Online-Utility.org (its text analysis tool)
  • Excel or Google Sheets

The top two URLs perform text analysis, with some of you possibly already familiar with the basic word-cloud generating abilities of tagcrowd.com. Online-Utility won’t pump out pretty visuals, but it provides a helpful breakout of common 2- to 8-word phrases, as well as occurrence counts on individual words. There are many tools that perform these functions; find the ones you like best if these don’t work!

I’ll start with Tagcrowd.com.

Utilizing Tagcrowd for analysis

The first thing I need to do is export a .csv of the data scraped from SF and combine all the extractor data columns into one. I can then remove blank rows, and after that scrub my data a little. Typically, I remove things like:

  • Punctuation
  • Extra spaces (the Excel “trim” function often works well)
  • Odd characters

Now that I’ve got a clean data set free of extra characters and odd spaces, I’ll copy the column and paste it into a plain text editor to remove formatting. I often use the one online at editpad.org.

That leaves me with this:

[Screenshot: the cleaned list of thread titles in Editpad]
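If you’d rather script the cleanup than do it by hand, here’s a minimal sketch in Python. It assumes the SF export is a file named sf_export.csv with extractor columns named “Extractor 1,” “Extractor 2,” and so on (adjust both names to match your actual export):

  import csv
  import re

  titles = []
  with open("sf_export.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          # Combine every extractor column into a single list of thread titles
          for column, value in row.items():
              if column and column.startswith("Extractor") and value and value.strip():
                  text = re.sub(r"[^\w\s]", " ", value)     # strip punctuation and odd characters
                  text = re.sub(r"\s+", " ", text).strip()  # collapse extra spaces
                  titles.append(text.lower())

  # One clean title per line, ready to paste into Tagcrowd or a similar tool
  with open("clean_titles.txt", "w", encoding="utf-8") as out:
      out.write("\n".join(titles))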

In Editpad, you can easily copy your clean data and paste it into the entry box on Tagcrowd. Once you’ve done that, hit visualize and you’re there.

Tagcrowd.com

[Screenshot: Tagcrowd word cloud generated from the thread titles]

There are a few settings down below that can be edited in Tagcrowd, such as minimum word occurrence, similar word grouping, etc. I typically utilize a minimum word occurrence of 2, so that I have some level of frequency and cut out clutter, which I’ve used for this example. You may set a higher threshold depending on how many words you want to look at.

For my example, I’ve highlighted a few items in the cloud that are somewhat informational.

Clearly, there’s a fair amount of discussion around “flowers,” “seeds,” and the words “identify” and “ID.” While I have no doubt my gardening sample site is already discussing most of these major topics, such as flowers, seeds, and trees, perhaps they haven’t realized how common questions are around identification. This one item could lead to a world of new content ideas.

In my example, I didn’t crawl my sample site very deeply, and thus my data was fairly limited. Deeper crawling will yield more interesting results, and you’ve likely already realized that, in this example, crawling during various seasons could highlight topics and issues that are currently important to gardeners.

It’s also interesting that the word “please” shows up. Many would probably ignore this, but to me, it’s likely a subtle signal about the communication style of the target market I’m dealing with. This is polite and friendly language that I’m willing to bet would not show up on message boards and forums in many other verticals ☺. Often, the greatest insights from this type of study, beyond identifying popular topics, relate to a better understanding of the communication style and phrasing your audience uses. All of this information can help you craft your strategy for connection, content, and outreach.

Utilizing Online-Utility.org for analysis

Since I’ve already scrubbed and prepared my data for Tagcrowd, I can paste it into the Online-Utility entry box and hit “process text.”

After doing this, I ended up with the following output:

[Screenshot: Online-Utility’s common phrase counts]

[Screenshot: Online-Utility’s word occurrence counts]

There’s more information available, but for the sake of space, I’ve grabbed only a couple of shots to give you the idea of most of what you’ll see.

Notice in the first image, the phrases “identify this plant” & “what is this” both show up multiple times in the content I grabbed, further supporting the likelihood that content developed around plant identification is a good idea and something that seems to be in demand.
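Those same phrase counts are easy to reproduce in a few lines of Python; here’s a sketch assuming the clean_titles.txt file produced in the scrubbing step above:

  from collections import Counter

  with open("clean_titles.txt", encoding="utf-8") as f:
      titles = [line.split() for line in f if line.strip()]

  # Tally every 2- and 3-word phrase across all of the scraped titles
  phrases = Counter(
      " ".join(words[i:i + n])
      for words in titles
      for n in (2, 3)
      for i in range(len(words) - n + 1)
  )

  # Print the most common phrases, keeping a minimum-occurrence filter of 2
  for phrase, count in phrases.most_common(20):
      if count > 1:
          print(count, phrase)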

Utilizing Excel for analysis

Let’s take a quick look at one other method for analyzing my data.

One of the simplest ways to digest the information is in Excel. After scrubbing the data and combining it into one column, a simple A→Z sort puts the information in a format that helps bring patterns to light.

[Screenshot: the sorted question list in Excel]

Here, I can see a list of specific questions ripe for content development! This type of information, combined with data from tools such as keywordtool.io, can help identify and capture long-tail search traffic and topics of interest that would otherwise be hidden.

Tip: Extracting information this way sets you up for very simple promotion opportunities. If you build great content that answers one of these questions, go share it back at the site you crawled! There’s nothing spammy about providing a good answer with a link to more information if the content you’ve developed is truly an asset.

It’s also worth noting that since this site was discovered through the Display Planner, I already have demographic information on the folks who are likely posting these questions. I could also do more research on who is interested in this brand (and likely posting this type of content) utilizing the powerful ad tools at Facebook.

This information allows me to quickly connect demographics with content ideas and keywords.

While intent has proven to be very powerful and will sometimes outweigh misaligned messaging, it’s always great to know as much as possible about who you’re talking to and to be able to cater your messaging to them.

Wrapping it up

This is just the beginning and it’s important to understand that.

The real power of this process lies in its use of simple, affordable tools to gain information efficiently — making it accessible to many on your team and an easy sell to those who hold the purse strings, no matter your organization’s size. The process is affordable for small and mid-size businesses, and those at the enterprise level are far less likely to be stuck waiting on approval for larger purchases.

What information is gathered and how it is analyzed can vary wildly, even within my stated objective of generating content ideas. All of it can be right. The variations on this method are numerous and allow for creative problem solvers and thinkers to easily gather data that can bring them great insight into their audiences’ wants, needs, psychographics, demographics, and more.

Be creative and happy crawling!

5 Ways for Job-Seeking Millennials to Clean Up Their Social Media Profiles Today

by Christie Garton and first published on Recruiter.com

Graduation has come and gone. If you’re like so many young people today who were unable to secure professional employment in the field of their choice before leaving college, you’re likely still hunting for those ideal job postings, submitting applications, and going on as many interviews as possible.

Resume in order? Check. Networking events attended? Check. Social media accounts cleaned up? Hmm.

If you haven’t done so already, you might want to seriously rethink what you’ve put out into the social media universe as well. This, believe it or not, is a critical part of the job search.

A recent survey conducted by my nonprofit, the 1,000 Dreams Fund, via Toluna Quicksurveys found that half of job seekers polled between the ages of 18 and 25 don’t plan to clean up their social media profiles before applying for jobs. This is a big mistake, especially given that employers say they use social media to screen and possibly eliminate candidates, according to another recent survey.

The bottom line is this: Don’t let some social media goof overpower your stellar application and prevent you from becoming the next promising employee at the company of your dreams!

Here are five tried-and-true tips from other successful grads about cleaning up your social media profile during the all-important job hunt!

1. Google Yourself

Search yourself to see what comes up. Be sure to dig deep and see what each page contains. What you see may surprise you – and it’s the quickest way for you to gauge what employers are seeing.

2. Keep It Private!

Depending on what you find during your Google search, it may be a good idea to make your Facebook profile private so that only those in your network of friends can see all the fun you had in school.

3. Delete, Delete, Delete!

Your employer can access pretty much anything online. If you wouldn’t want them to see a specific post, tweet, or picture, delete it. If you find something on a third-party site you don’t want out there, reach out to the publisher or editor to see if they’ll remove the post. In most cases, they will, especially if you are clear that it could impact your ability to find a job.

4. Keep it PG

Getting ready to post an update, or maybe a pic from that girls’ night out? If it’s something you wouldn’t want your teenage cousin or grandmother to see, you should probably reconsider! At the end of the day, there’s no way to gauge who is looking at your pictures or posts, so you should be sure to avoid posting anything controversial.

5. Leave It to the Pros

Cleaning up your social media presence can be a time-consuming process, so it’s important to know that there are professional “scrubbing” services you can lean on. These services are especially useful when you’re dealing with something that’s hard to remove, because they pride themselves on cleaning up messy digital footprints.


Christie Garton is an award-winning social entrepreneur, author, and creator of the 1,000 Dreams Fund.

Google: Short Articles Won’t Penalize Your Site; Think About Users

by Barry Schwartz and first published on Search Engine Roundtable.

Google’s John Mueller covered lots and lots of myths this past Friday in the Google Hangout on Google+. He said at the 34:37 minute mark that having short articles won’t give you a Google penalty. He also said that even some long articles can be confusing for users. He said that short articles can be great and long articles can be great – it is about your users, not search engines.

The question posed was:

My SEO agency told me that the longer the article I write, the more engaged the user should be, or Google will penalize me for this. I fear writing longer articles with lots of rich media inside because of this. Is my SEO agency correct or not?

Back in 2012, Google said short articles can rank well and then again in 2014 said short articles are not low quality. John said in 2016:

So I really wouldn’t focus so much on the length of your article but rather making sure that you’re actually providing something useful and compelling for the user. And sometimes that means a short article is fine, sometimes that means a long article with lots of information is fine. So that’s something that you essentially need to work out between you and your users.

From our point of view, we don’t have an algorithm that counts words on your page and says, oh, everything until a hundred words is bad, everything between 200 and 500 is fine, and over 500 needs to have five pictures. We don’t look at it like that.

We try to look at the pages overall and make sure that this is really a compelling and relevant search result for users. And if that’s the case, then that’s perfectly fine. If that’s long or short, or lots of images or not, that’s essentially up to you.

Sometimes I think long articles can be a bit long-winded and might lose people along the way. But sometimes it’s really important to have a long article with all of the detailed information there. That’s really something that maybe is worth double-checking with your users, doing some A/B testing with them. Maybe getting their feedback in other ways, like sometimes you can put stars on the page so you have a rating, or maybe use Google Consumer Surveys to get a quick sample of how your users are reacting to that content. But that’s really something between you and your users, and not between you and the Google search engine, from that point of view.

I specifically did the Google Consumer Surveys approach when I was hit by the Panda 4.1 update, which I recovered from with Panda 4.2. I even published my results for all to see over here, and it showed that people, my readers, like my short content.

So it really isn’t about how short, long or detailed your content is. As long as the content satisfies the user, Google should be satisfied too.
